Excel Remove Duplicates based on time condition

0 votes
A 3 month rolling time condition is the basis for my search for duplicates to be eliminated from a 250,000-row Excel sheet.

We have a lot of user IDs and the dates on which they visited, however many of these visits are spread out over long periods of time (often more than a year), and many of them occur on the same day or a few days.

An example is the greatest method to illustrate what I want to do. So if they came to visit on January 1, January 1, January 3, January 8, February 4, June 5, December 1, and December 2, I would want to use the first date of January 1, June 4, and December 1.

If they came on January 1, January 2, January 3, February 8, or April 9, then on August 1 or September 1, I would like January 1 and August 8.

As a result, we want to book the initial date, then observe how frequently they return within 3 months of each visit, and if they go more than 3 months without coming back, book the first date they have after they do. After three months, they occasionally return four or five more times, and the data may cover several years.

Is there a way I can accomplish this?
Nov 2, 2022 in Others by Kithuzzz
• 38,000 points
342 views

No answer to this question. Be the first to respond.

Your answer

Your name to display (optional):
Privacy: Your email address will only be used for sending these notifications.
0 votes

Try this (description below) if the UserID is in column A, the VisitDate is in column B, the headings are in row 1, there is a blank row in row 2, and the data begins in row 3:

Array Formula version:

  • sort the rows ascending by VisitDate
  • in B2 put 1/1/1900 so it won't match anything (but it has to be a date)
  • in C3 put this array formula (press control-shift-enter instead of just enter):
    • =SUM((B$2:B2<DATE(YEAR(B3),MONTH(B3)-3,DAY(B3)))*(A$2:A2=A3))=SUM((A$2:A2=A3)*1)
  • Copy the formula in C3 down to every row of data
  • Filter on Unique = TRUE
  • if you want to resort you will need to copy and paste back column C by values

New non-array formula version:

  • sort the rows ascending by VisitDate
  • in B2 put 1/1/1900 so it won't match anything (but it has to be a date)
  • in C3 put this normal formula (just press enter):
    • =COUNTIFS(B$2:B2,"<"&DATE(YEAR(B3),MONTH(B3)-3,DAY(B3)),A$2:A2,A3)=COUNTIF(A$2:A2,A3)
  • Copy the formula in C3 down to every row of data
  • Filter on Unique = TRUE
  • if you want to resort you will need to copy and paste back column C by values

This produces the following with my sample data (array formulas may take a very long time to calculate for lots of rows):

   |    A   |      B     |   C
---+--------+------------+--------
 1 | UserID |  VisitDate | Unique
 2 |        |  1/01/1900 | 
 3 | a      |  1/01/2017 | TRUE
 4 | a      |  1/01/2017 | FALSE
 5 | b      |  2/01/2017 | TRUE
 6 | b      |  2/01/2017 | FALSE
 7 | a      |  3/01/2017 | FALSE
 8 | c      |  3/01/2017 | TRUE
 9 | c      |  3/01/2017 | FALSE
10 | b      |  4/01/2017 | FALSE
11 | c      |  5/01/2017 | FALSE
12 | a      |  8/02/2017 | FALSE
13 | b      |  9/02/2017 | FALSE
14 | c      | 10/02/2017 | FALSE
15 | a      |  4/06/2017 | TRUE
16 | a      |  5/06/2017 | FALSE
17 | b      |  5/06/2017 | TRUE
18 | b      |  6/06/2017 | FALSE
19 | c      |  6/06/2017 | TRUE
20 | c      |  7/06/2017 | FALSE
21 | a      |  1/12/2017 | TRUE
22 | a      |  1/12/2017 | FALSE
23 | a      |  2/12/2017 | FALSE
24 | b      |  2/12/2017 | TRUE
25 | b      |  2/12/2017 | FALSE
26 | b      |  3/12/2017 | FALSE
27 | c      |  3/12/2017 | TRUE
28 | c      |  3/12/2017 | FALSE
29 | c      |  4/12/2017 | FALSE

I hope this helps you. 

answered Nov 4, 2022 by narikkadan
• 63,600 points

edited Mar 5

Related Questions In Others

0 votes
1 answer

How To Copy/Cut Row of Data Based on TRUE/FALSE Condition [Excel VBA]

Solution Loop through the rows on the Price ...READ MORE

answered Feb 4, 2023 in Others by narikkadan
• 63,600 points
1,023 views
0 votes
0 answers

get unique distinct items based on a condition and a date condition

hi could you help me sorry my ...READ MORE

Nov 23, 2021 in Others by adolpo
• 120 points
456 views
0 votes
1 answer

datatable remove column on export to pdf and excel

When we are using jquery datatable for ...READ MORE

answered Feb 17, 2022 in Others by gaurav
• 23,260 points
4,475 views
0 votes
1 answer

Excel formula for searching two text in one cell and return values based on the result

You can include a second IF within ...READ MORE

answered Oct 10, 2022 in Others by narikkadan
• 63,600 points
1,321 views
0 votes
1 answer

Excel Conditional Formatting based on Adjacent Cell Value

The row number used in the formula ...READ MORE

answered Oct 10, 2022 in Others by narikkadan
• 63,600 points
2,068 views
0 votes
1 answer

Sort Excel worksheets based on name, which is a date

Sorting sheets of a workbook are rather ...READ MORE

answered Oct 17, 2022 in Others by narikkadan
• 63,600 points
644 views
0 votes
1 answer

Excel How to Remove Duplicate Rows in multiple of 3 Same Value

Put the following formula into a helper ...READ MORE

answered Dec 28, 2022 in Others by narikkadan
• 63,600 points
655 views
0 votes
1 answer

Numbering/sequencing sets of same column values

You can accomplish this with countif and a sliding ...READ MORE

answered Jan 3, 2023 in Others by narikkadan
• 63,600 points
781 views
0 votes
1 answer

Excel If duplicate agree paste the value

With Office 365: =LET( nm,A2:A5, ...READ MORE

answered Jan 19, 2023 in Others by narikkadan
• 63,600 points
616 views
0 votes
1 answer

Retrieve epay.info Balance with VBA and Excel

This code should log you in, provided ...READ MORE

answered Sep 5, 2018 in Blockchain by digger
• 26,740 points
1,473 views
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP