Question

This is an R programming question:

I am trying to improve line 59 of my script:

date_movie <- stringi::stri_extract(edx$title, regex = "\\d{4}", comments = TRUE) %>% as.numeric()

It is supposed to extract the release year from the title, but because some titles contain a four-digit number before the year, I currently have to fix some of the years "manually" in lines 74 to 87.
Could you help me modify the code so that the year is extracted systematically (perhaps with a logical expression), even when another number appears earlier in the title?

Thank you :)

Answer #1

Solution 1

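You can tighten the regular expression so that it only matches plausible release years.

Instead of regex = "\\d{4}", you can use regex = "189[6-9]|19\\d{2}|20[01]\\d" so that other four-digit numbers are not accepted as a year.

189[6-9] matches the years 1896, 1897, 1898 and 1899.

19\\d{2} matches the years 1900-1999.

20[01]\\d matches the years 2000-2019.

Note that regex = "\\d{4}" matches any four-digit sequence, which is why a number earlier in the title can be picked up instead of the year. The restricted pattern still returns the first match, so a title that begins with a number between 1896 and 2019 would still be read as the year.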
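To handle titles that start with a year-like number, one option is to anchor the pattern to the parentheses at the end of the string rather than restricting the range. The following is a minimal sketch, assuming the titles follow a "Title (YYYY)" layout; the sample titles are hypothetical and stand in for edx$title.

```r
library(stringi)

# Hypothetical sample titles in an assumed "Title (YYYY)" layout
titles <- c("Toy Story (1995)",
            "2001: A Space Odyssey (1968)",
            "1984 (1956)",
            "Se7en (1995)")

# Range-restricted pattern: returns the first four-digit number in range,
# so a leading "2001" or "1984" is still taken as the year
first_year <- as.numeric(
  stri_extract_first_regex(titles, "189[6-9]|19\\d{2}|20[01]\\d")
)

# Anchored pattern: four digits inside parentheses at the very end of the
# title (optional trailing whitespace allowed); column 2 is the capture group
m <- stri_match_first_regex(titles, "\\((\\d{4})\\)\\s*$")
date_movie <- as.numeric(m[, 2])

# Compare the two approaches side by side
data.frame(title = titles, first_year = first_year, date_movie = date_movie)
```

Titles without a trailing "(YYYY)" yield NA in date_movie, which makes any remaining exceptions easy to find with is.na().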

Solution 2

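If the year is always at the end of the title string, enclosed in parentheses as in the data shown, you can extract the year substring by position and convert it to numeric. Negative indices in stri_sub count from the end of the string, so -5 to -2 selects the four characters just before the closing parenthesis.

The code to do that is: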

date_movie <- as.numeric(stringi::stri_sub(edx$title,-5,-2))
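Because the positional approach depends on the exact layout, a stray trailing space or a title without a year would shift the substring and produce NA or a wrong value. A minimal sketch of a more defensive variant follows; the sample titles are hypothetical, and the trimming and validation steps are additions that are not part of the original answer.

```r
library(stringi)

# Hypothetical titles: one clean, one with a trailing space, one with no year
titles <- c("Toy Story (1995)", "Heat (1995) ", "Untitled")

# Trim surrounding whitespace so the closing parenthesis is the last character
trimmed <- stri_trim_both(titles)

# Take the four characters just inside the closing parenthesis
year_str <- stri_sub(trimmed, -5, -2)

# Convert only substrings that are exactly four digits; everything else is NA
is_year <- stri_detect_regex(year_str, "^\\d{4}$")
date_movie <- ifelse(is_year, suppressWarnings(as.numeric(year_str)), NA_real_)

# Titles that still need a manual look
needs_review <- titles[is.na(date_movie)]
```

Checking needs_review after the extraction shows which titles, if any, still need manual correction.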
