[vc_row type=”in_container” full_screen_row_position=”middle” scene_position=”center” text_color=”dark” text_align=”left” top_padding=”60″ overlay_strength=”0.3″ shape_divider_position=”bottom” bg_image_animation=”none” shape_type=”mountains”]
For over a year Claudia has been working as a data scientist at Data Science Lab.
Her hobbies are indoor soccer, bodypump and Formula 1.
She also enjoys eating out, Spanish tapas being her favorite!
[divider line_type=”No Line” custom_height=”60″]
What did you do before starting at Data Science Lab?
In 2018, I graduated from the master’s program in Business Analytics at VU University.
During my master’s I worked as a data analytics consultant at QNH/Ilionx.
After that, I wrote my graduation thesis at Van Amersfoort Racing, Max Verstappen’s former Formula 3 team.
The goal was to find new racing talent using data analysis and machine learning.
With my master’s degree in hand, I went straight to Data Science Lab.
[divider line_type=”No Line” custom_height=”60″]
What are your duties as a “data scientist” at Data Science Lab and what do you find most interesting about them?
Most of the week I work for Port of Amsterdam.
There I am part of the Data team and we are responsible for everything to do with data.
Think, for example, of creating reports or Power BI dashboards, or of analyzing data and making predictions.
I am mainly concerned with the latter.
For example, I made an analysis of the crowding at waiting berths and the use of shore power by barges.
In addition, we have recently developed several machine learning models that have also been put into production.
This allows us to predict how much cargo a ship will transship at the port, even before the ship has entered the port.
What I find most interesting about this is seeing that the outcomes of analyses and models provide direct value to the port.
The outcomes are used to make decisions that make the processes in the port even more efficient, safer and cleaner.
[divider line_type=”No Line” custom_height=”60″]
How do you do your work in the current situation?
Fortunately, I can just continue my work from home.
I live alone and I have no children or pets, so that makes a difference ;).
During the day I have a lot of contact with my direct colleagues via MS Teams.
Besides work-related meetings, we don’t skip the (digital) Friday afternoon drinks.
A few weeks ago we even held a digital pub quiz with all DSL colleagues, and the monthly TechDays simply continue digitally as well.
Of course, I do miss the contact with colleagues such as a chat at the coffee machine.
Therefore, I hope to be able to physically catch up with everyone again soon, when the situation allows.
[divider line_type=”No Line” custom_height=”60″]
Within your current project, what is the biggest technical challenge?
The biggest challenge within my current project is dealing with large amounts of data (Big Data).
Recently we have been working on making AIS data (position signals of ships in the port) available on the data platform.
This is streaming data where sometimes as many as 200 messages per second have to be captured and processed.
We also want to perform all kinds of calculations on this data, such as determining whether a ship is at a berth.
To make this data available, we first have to think carefully about the data architecture.
For example, some technologies are ruled out immediately because they cannot process the large flow of data or because calculations take too long.
During this project, I also learned a lot about the different services available on Microsoft’s Azure platform, and which of them are and are not suitable for big data and for handling streaming data.
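As an illustration of the kind of calculation mentioned above, here is a minimal Python sketch of a berth check on a stream of position messages. It is not the Port of Amsterdam implementation: the `AisMessage` and `Berth` types, the rectangular berth outline, and the 0.5 knot speed threshold are all assumptions made for the example, and the coordinates are made up.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Sequence, Tuple


@dataclass(frozen=True)
class AisMessage:
    """Hypothetical, simplified AIS position report."""
    mmsi: int           # ship identifier
    lat: float          # latitude in degrees
    lon: float          # longitude in degrees
    speed_knots: float  # speed over ground


@dataclass(frozen=True)
class Berth:
    """Hypothetical berth, modelled as a latitude/longitude bounding box."""
    name: str
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


# Assumed threshold: a ship moving faster than this is not treated as moored.
MAX_MOORED_SPEED_KNOTS = 0.5


def berth_for(msg: AisMessage, berths: Sequence[Berth]) -> Optional[str]:
    """Return the name of the berth the ship is lying at, or None."""
    if msg.speed_knots > MAX_MOORED_SPEED_KNOTS:
        return None
    for berth in berths:
        if berth.contains(msg.lat, msg.lon):
            return berth.name
    return None


def annotate(stream: Iterable[AisMessage],
             berths: Sequence[Berth]) -> Iterator[Tuple[int, Optional[str]]]:
    """Process messages one at a time, as they would arrive from a stream."""
    for msg in stream:
        yield msg.mmsi, berth_for(msg, berths)


# Usage with made-up coordinates.
berths = [Berth("Example berth", 52.40, 52.41, 4.80, 4.82)]
messages = [
    AisMessage(mmsi=1, lat=52.405, lon=4.81, speed_knots=0.1),  # moored
    AisMessage(mmsi=2, lat=52.405, lon=4.81, speed_knots=6.0),  # sailing past
]
for mmsi, berth_name in annotate(messages, berths):
    print(mmsi, berth_name)
```

The check itself is cheap per message. The harder part described above is running it continuously at a rate of up to 200 messages per second, which is where the choice of architecture comes in.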
[divider line_type=”No Line” custom_height=”60″]
What do you think is the biggest misconception of data science?
I think the biggest misconception is to think that machine learning can provide a solution to all problems.
Of course, there are a great many use cases where data science can provide a solution, as long as sufficient data is available.
But when the data is of insufficient quality or even missing, the ‘garbage in, garbage out’ principle applies.
This holds true precisely for many events that we as a society want to predict.
Consider, for example, the outbreak of the coronavirus or predicting the next financial crisis.
No machine learning model could have predicted this pandemic because training data is simply lacking.
In addition, predicting the spread of the virus is difficult because the data on infections and deaths is incomplete.
[divider line_type=”No Line” custom_height=”60″]
How do you see data science in 10 years?
In 10 years, I think a data scientist will be more of a data science engineer than a programmer.
By that I mean that programming will probably no longer be as prominent in the day-to-day work of a data scientist.
Instead, much more use will be made of ‘off-the-shelf’ data science solutions, such as pre-trained models.
Microsoft and Google already offer them.
A data scientist will be more concerned with tying the various processes together rather than developing the model itself.
[divider line_type=”No Line” custom_height=”60″]
What problem would you one day like to solve with data science?
Data science in the world of sports, especially Formula 1, has always interested me as a sports fan.
It is unimaginable how much data is generated per second by all the sensors on a Formula One car.
That, of course, is paradise for a data scientist like me.
Ideally, I would unleash my skills on this data to determine the best race strategy, or perhaps to develop a car with which Max Verstappen can finally beat Mercedes.
[vc_row type=”in_container” full_screen_row_position=”middle” scene_position=”center” text_color=”dark” text_align=”left” top_padding=”90″ overlay_strength=”0.3″ shape_divider_position=”bottom” bg_image_animation=”none” shape_type=””]