Friday, 15 February 2013

r - Fixing dates that were coerced into the wrong format


I have a large data set with dates that were wrongly coerced.

Data:

  id <- c(1:12)
  date <- c("2014-01-03", "2001-08-14", "2001-08-14", "2014-06-02",
            "2006-06-14", "2006-06-14", "2014-08-08", "2014-08-08",
            "2008-04-14", "2009-12-13", "2010-09-14", "2012-09-14")
  df <- data.frame(id, date)

Structure:

     id       date
  1   1 2014-01-03
  2   2 2001-08-14
  3   3 2001-08-14
  4   4 2014-06-02
  5   5 2006-06-14
  6   6 2006-06-14
  7   7 2014-08-08
  8   8 2014-08-08
  9   9 2008-04-14
  10 10 2009-12-13
  11 11 2010-09-14
  12 12 2012-09-14

The data set only includes, or should only include, the years 2013 and 2014. The dates 2001-08-14 and 2006-06-14 are most likely 2014-08-01 and 2014-06-06, respectively.

Output:

     id       date
  1   1 2014-01-03
  2   2 2014-08-01
  3   3 2014-08-01
  4   4 2014-06-02
  5   5 2014-06-06
  6   6 2014-06-06
  7   7 2014-08-08
  8   8 2014-08-08
  9   9 2014-04-08
  10 10 2013-12-09
  11 11 2014-09-10
  12 12 2014-09-12

How can I fix this mess?

We can convert the 'date' column to the 'Date' class, then extract the year to create a logical index ('indx') that flags every row whose year is not 2013 or 2014.

  df$date <- as.Date(df$date)
  indx <- !format(df$date, '%Y') %in% 2013:2014

Using lubridate, remove the first two characters of the flagged dates and convert what is left to the 'Date' class with dmy. For example, "2001-08-14" becomes "01-08-14", which dmy reads as day 01, month 08, year 14.

  library(lubridate)
  df$date[indx] <- dmy(sub('^..', '', df$date[indx]))
  df
  #    id       date
  # 1   1 2014-01-03
  # 2   2 2014-08-01
  # 3   3 2014-08-01
  # 4   4 2014-06-02
  # 5   5 2014-06-06
  # 6   6 2014-06-06
  # 7   7 2014-08-08
  # 8   8 2014-08-08
  # 9   9 2014-04-08
  # 10 10 2013-12-09
  # 11 11 2014-09-10
  # 12 12 2014-09-12
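If lubridate is not available, the same idea can probably be expressed in base R alone. The following is only a sketch under the same assumption as above, namely that every date outside 2013 and 2014 was originally written as dd-mm-yy and read as yy-mm-dd. The helper name fix_swapped_dates and its valid_years argument are made up for this illustration.

```r
# Base-R sketch; fix_swapped_dates is a hypothetical helper, not part of any package.
fix_swapped_dates <- function(x, valid_years = 2013:2014) {
  x <- as.Date(x)
  # Rows whose year falls outside the expected range are assumed to be misparsed.
  bad <- !is.na(x) & !(format(x, "%Y") %in% valid_years)
  # Drop the century digits ("2001-08-14" -> "01-08-14") and re-read as dd-mm-yy.
  x[bad] <- as.Date(sub("^..", "", format(x[bad])), format = "%d-%m-%y")
  x
}

date <- c("2014-01-03", "2001-08-14", "2006-06-14", "2009-12-13")
fixed <- fix_swapped_dates(date)
format(fixed)
# "2014-01-03" "2014-08-01" "2014-06-06" "2013-12-09"
```

Any flagged value that does not form a valid dd-mm-yy date comes back as NA, so it is worth checking for new NA values after the conversion.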
