Split lines into words – Sympathy for the devil

If you saw the movie Focus, you may remember the scene at the stadium, set to this song, and this line of dialogue:

The Mandarin word for “five” is woo. There are 124 “woo-woos” in “Sympathy for the Devil.”

Here is a short clip with the full explanation of the scam…

So, is this claim accurate? Let’s find out…

First, get the lyrics of “Sympathy for the Devil” (just google “sympathy for the devil lyrics” and copy the result into a text file), then load the file into Qlik Sense.

[Lyrics]:
LOAD
    rowno() as idline,
    [@1:n] as line
FROM [lib://Desktop/Sympathy For The Devil.txt]
(fix, codepage is 28591, no labels);

The result is as follows:

Next, how do we split the “line” field into single words? The SubField function to the rescue!

Load
    idline,
    rowno() as idword,
    SubField(line,' ') as word
resident Lyrics;

Without the third parameter, the function creates a new row for every split value, creating this structure.
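
For contrast, here is a minimal sketch of what happens when the third parameter is supplied: SubField then returns only the requested piece and adds no extra rows. The table label [FirstWords] and the field name firstword are made up for this illustration.

```
// Sketch only: [FirstWords] and firstword are hypothetical names.
// With a field number as third parameter, SubField returns one value per row.
[FirstWords]:
LOAD
    idline,
    SubField(line, ' ', 1) as firstword
RESIDENT Lyrics;
```

This keeps one row per lyric line, holding just the first word of that line.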

Now, the count…
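
The count can be done in a chart (word as dimension, Count(idword) as measure) or, as a rough sketch, directly in the script. The version below assumes the table of split words was given the label [Words], which the load above does not do; [WordCount] and occurrences are likewise names invented for this example.

```
// Sketch only: assumes the split-word table was labelled [Words].
// Produces one row per distinct word with the number of times it appears.
[WordCount]:
LOAD
    word,
    Count(idword) as occurrences
RESIDENT Words
GROUP BY word;
```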

Why is “who” different from “who,”? The trailing comma (and any difference in case) makes them separate values, so let’s clean them up…

Load
    idline,
    rowno() as idword,
    keepchar(upper(SubField(line,' ')),'ABCDEFGHIJKLMNOPQRSTUVWXYZ') as word
resident Lyrics;

The final result is as follows:

So, the combined count for “WOO” and “WHO” comes to 83, well short of the 124 claimed.
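
One possible way to get that combined figure in a KPI or chart is a set-analysis expression over the cleaned word field. This is a sketch that assumes the cleaned-up load above is the one in the data model, so the values are already upper-case and free of punctuation.

```
// Sketch only: counts the word rows whose cleaned value is WOO or WHO.
Count({<word = {'WOO', 'WHO'}>} idword)
```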

Happy Qliking!
