Please subscribe to watch this video

Sign Up

Comments

Video transcript

Hi there, in this video we're going to look at removing duplicate contents. It's a good way to check to make sure you haven't got duplicates. I've definitely got some in here, I can see a couple right there.There's a 'Leslie Glover', and they're exactly the same, and I want to remove them, but it's a big long list, I want to go and automatically check it.

There's going to be an instance though where-- these ones are easy to find because they're exactly the same. There's no difference between these two. All the way along. But there's another one down here with'Haty Chavez'. They're exactly the same except they got different times. It might be that they got a time stamp of when they were imported so there's two different ways we're going to do this.

So what we need to do is select the whole data set. And we can do that now that we've cleared our blank lines. Before we had to kind of click this first one, go down here, hold 'Shift'. What you can do is click anywhere in your data set, and go 'Control A'. And it's going to pick the whole thing. It won't do that if you've got blank spaces in any of these rows. So I've got the whole thing. The easiest one is just to remove the ones, go to 'Data', and let's go to 'Remove Duplicates'.

We're going to leave it on 'Select All'. And we're going to click 'OK'. It's going to get rid of two of them. I know there's four, so it's kind of done it okay. What I'd like to do after that is, just to kind of highlightanything, so-- that might be all you need, and you can finish this video, but let's say we want to just double check.

It's quite important, I do this quite a bit for my subscription service. I want to send a specific email to people who have signed up for my free stuff, but in different email from the people that have paid. I don't want to hassle the people that have already paid, and say, "Hey, come pay." I want to make sure that  emails goes out to people that are-- so what we're going to do is-- a cool little thing to do, select anywhere in these cells, 'Control A'. Selects the whole data set, and we're going to go back to 'Home'. And there's one in here called 'Conditional Formatting'. Drop that down, go to 'Highlight Cells', and go to this one thatsays 'Duplicate Values'. And what it's going to do is-- Click 'OK'. It's going to just highlight anything that's a duplicate.

So it's not just the whole row like it was when it was deleting before. Anything that's a duplicate here, it's going to highlight. And it kind of makes it a little bit obvious that-- not to worry about these times because obviously somebody else has got the same time as them. Somebody is also 'Unpaid'. And it's these lines here. I can see, there's two 'Haty Chavez'. And then I can kind of compare and go-- they're exactly the same except for this one.

So what might have happened in this fun run is that, maybe Haty and her, or his daughter ran as well. And they ended up being classed as the same person with different times, or maybe, just-- I have no idea. Don't need to know.

It's just a really handy way to kind of double check before you go off and commit to things. I'm going to 'undo that' and get rid of my styling. So what you can do to get around that just to kind of make your deleted duplicate content a little bit more robust than our first option, I'm going to select all again, go back to 'Data'. Go along here where it says 'Remove Duplicates', and instead of saying "I want to compare all of these column headings,” and find something that matches across all of them.

I'm going to unselect them all, and I'm just going to say "I want any rows that have the same email,” because I feel that's really unique. If you have the same email as somebody else, you're the exact same person. So you might have something else-- that's the date of entry in here, you might be a SocialSecurity Number, Inland Revenue document date, or something like that. So, client ID--

So 'email address', I'm going to click 'OK'. And it's found, those are the two, so it's removed 'Haty'. And there was another data option in there so it's removed the entire column. Which one? It's removed the second one. So we had two for Haty, it's just removed the second one. So be careful when you arejust randomly deleting. Make sure you delete the stuff you want to.

All right, that is deleting things, duplicates even, from your Excel sheet. Let's go into the next video andtidy up this list even more. How much more tidy could it get? A little bit more.