Dashboard Week: Webscraping

by Lauren Halliwell

Today we were tasked with scraping data on the teams and games (including player statistics and match information) from the NRL website.

The difficulty came from the fact that the website builds its pages with JavaScript rather than plain HTML, which meant the data was laid out differently from what I am used to. Instead of sitting under the objects on the page, it was placed in a 'q-data' section of the code as one massive block of text. When I copied the DownloadData column into VS Code, this block was easy to miss, because it appeared as a single line and the data inside it was not visible without scrolling further along.
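For anyone who wants to see the idea in code, here is a minimal Python sketch of pulling such a block out of a page. It is not the workflow used for this task, and it rests on assumptions: that 'q-data' is an HTML attribute and that its value is HTML-escaped JSON. The sample page, the `QDataExtractor` class and the field names are all made up for illustration.

```python
import json
from html.parser import HTMLParser


class QDataExtractor(HTMLParser):
    """Collects the decoded contents of every q-data attribute on a page."""

    def __init__(self):
        super().__init__()
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser has already unescaped entities such as &quot; here.
        for name, value in attrs:
            if name == "q-data" and value:
                # Assumption: the attribute value is a JSON document.
                self.blocks.append(json.loads(value))


def extract_q_data(page_source):
    """Return a list with one parsed object per q-data attribute found."""
    parser = QDataExtractor()
    parser.feed(page_source)
    parser.close()
    return parser.blocks


# Hypothetical page source: one long line, like the block described above.
SAMPLE_PAGE = (
    '<div id="match" q-data="{&quot;homeTeam&quot;: '
    '{&quot;name&quot;: &quot;Team A&quot;}, &quot;players&quot;: []}"></div>'
)

blocks = extract_q_data(SAMPLE_PAGE)
print(blocks[0]["homeTeam"]["name"])
```

Searching for the attribute by name rather than scanning the page by eye avoids the problem of the block hiding in a single very long line.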

To get the data out, I had to split the large q-data block into smaller pieces: one containing the team data, one containing the player statistics and one containing the match information. I then parsed each piece, making full use of tokenize and parse to extract name and value pairs, and cross-tabbed those pairs into complete data sets ready for use.
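The same split, name/value and cross-tab steps can be sketched in Python. This is only an illustration of the shape of the process, assuming the block has already been decoded into a dictionary; the keys (`teams`, `players`, `match`) and every field name in the sample are hypothetical, not the real NRL structure.

```python
# Hypothetical decoded block with the three pieces described above.
SAMPLE_BLOCK = {
    "teams": [{"teamId": 1, "name": "Team A"}, {"teamId": 2, "name": "Team B"}],
    "players": [{"playerId": 10, "teamId": 1, "tries": 2}],
    "match": {"matchId": 99, "venue": "Example Stadium"},
}


def to_pairs(records, id_field):
    """Flatten records into (record id, field name, field value) rows."""
    pairs = []
    for record in records:
        for name, value in record.items():
            if name != id_field:
                pairs.append((record[id_field], name, value))
    return pairs


def cross_tab(pairs, id_field):
    """Pivot name/value rows back into one row per record id."""
    rows = {}
    for record_id, name, value in pairs:
        rows.setdefault(record_id, {id_field: record_id})[name] = value
    return list(rows.values())


# Split the block into its three pieces, then process each one separately.
team_pairs = to_pairs(SAMPLE_BLOCK["teams"], "teamId")
player_pairs = to_pairs(SAMPLE_BLOCK["players"], "playerId")
match_pairs = to_pairs([SAMPLE_BLOCK["match"]], "matchId")

team_table = cross_tab(team_pairs, "teamId")
player_table = cross_tab(player_pairs, "playerId")
match_table = cross_tab(match_pairs, "matchId")
print(team_table)
```

Going through the long name/value form first means each piece can have different fields without needing its own parsing logic; the cross-tab step only widens the data at the end.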

I enjoyed this task because I like the problem-solving nature of webscraping, and I am looking forward to seeing how this data is used on the final day of Dashboard Week tomorrow.

