Data Science and Artificial Intelligence have become well-known terms over the past few years. However, it’s still sometimes difficult to know exactly what they mean, how they work and what the people working in these fields actually do.
To gain a better understanding of these areas, we published a post asking which questions you would like to ask Diogo Sousa, Advertio’s Data Scientist. Here are your questions and his answers.
1. Firstly, what made you decide to pursue a career in Data Science?
At University, I had many classes that sparked my interest in Data Science. I particularly remember two key moments that really helped me decide to pursue a career as a Data Scientist.
The first moment was a project for a University class based on the old Snake game. This particular version, however, had a few twists: first, there were two snakes instead of only one and second, one of them was controlled by an intelligence that we had created. Furthermore, the snakes could communicate with each other, which for me was really the icing on the cake. Playing such a simple game, which I had played countless times when I was younger, with a little piece of technology added by me, was definitely an awesome experience.
The second was my Master’s dissertation, in which I had the opportunity to create a Machine Learning project from scratch. This project really motivated me, and it allowed me to implement every stage needed to complete a Data Science project.
2. What’s the most exciting part of your job?
I would say there are two main aspects that I really like about working in Data Science.
First, I love the beginning of a new project. Every time I start working on something new I feel thrilled, because I know I’ll have to work out how I can “make a machine smarter” with the data I have. This is exciting because it lets me think about different ways of doing this and test different methods to get better results.
The second aspect, which is closely related to the first, is when I actually get those results back and they’re positive. I feel great when, after teaching a machine, it can actually suggest and provide good results. It’s a “mission accomplished” feeling.
3. Do you need to be aware of the privacy of the data you use as input for the Machine Learning algorithms? If so, do you follow a standard guideline (GDPR, etc.)?
Yes! Before using Machine Learning algorithms in a Data Science project, the first step is to clean and explore the available data.
In this step, if there’s any data that doesn’t respect privacy policies, it needs to be removed or masked in a way that respects them. In my experience at Advertio so far, I haven’t had any issues with sensitive data but, regardless of this, I always need to be careful and follow privacy policies.
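To make the “removed or masked” step more concrete, here is a minimal sketch in Python. It is an illustration only, not Advertio’s actual pipeline: the field names (`full_name`, `phone`, `email`), the salt and the `mask_record` helper are all hypothetical. Note also that salted hashing is pseudonymisation rather than full anonymisation, so whether it is sufficient depends on the privacy policy that applies.

```python
import hashlib

# Hypothetical field names; the real data schema is not described in the interview.
DROP_FIELDS = {"full_name", "phone"}   # removed entirely
HASH_FIELDS = {"email"}                # replaced by a salted hash


def mask_record(record, salt="example-salt"):
    """Return a copy of `record` with sensitive fields removed or masked."""
    masked = {}
    for key, value in record.items():
        if key in DROP_FIELDS:
            continue  # drop the field altogether
        if key in HASH_FIELDS:
            # Same input and salt always give the same token, so rows can
            # still be grouped by user without exposing the raw value.
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            masked[key] = digest[:16]
        else:
            masked[key] = value
    return masked


row = {"full_name": "Jane Doe", "email": "jane@example.com", "clicks": 12}
clean_row = mask_record(row)
# clean_row has no "full_name", a hashed "email" and the original "clicks"
```

Non-sensitive fields such as `clicks` pass through unchanged, so the cleaned rows can still be used for exploration and training.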
4. What compromises are you willing to make in order to get the best guesses in the shortest time possible? Timeouts? Smaller datasets? Lower acceptable limits (higher probability of providing worse results)?
I would say there are two different scenarios for this question.
The first one covers the algorithms and services that need to run when one of our platform users performs an action. In this situation, yes, there are many compromises to be made.
Advertio is providing a service and no client wants to wait long to use the app, so we use timeouts that stop the system once a time limit is reached. Logically, more time usually means better results, but if the user gets tired of waiting and leaves the platform, all that effort will have been in vain.
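The trade-off between waiting time and result quality can be sketched as a search that keeps the best result found so far and stops when its time budget runs out. This is an assumed, simplified illustration in Python, not the way Advertio’s platform is implemented; `best_within_budget`, `score` and `budget_seconds` are hypothetical names.

```python
import time


def best_within_budget(score, candidates, budget_seconds):
    """Evaluate candidates until the time budget is spent; return the best seen.

    At least one candidate is always evaluated, so the user gets an answer
    even with a very small budget (assuming `candidates` is not empty).
    """
    deadline = time.monotonic() + budget_seconds
    best, best_score = None, float("-inf")
    for candidate in candidates:
        if best is not None and time.monotonic() >= deadline:
            break  # out of time: return what we have instead of making the user wait
        current = score(candidate)
        if current > best_score:
            best, best_score = candidate, current
    return best, best_score


# Toy example: the "quality" of a candidate is how close it is to 7.
result, quality = best_within_budget(lambda x: -abs(x - 7), range(10), budget_seconds=1.0)
```

With a generous budget the loop sees every candidate; with a tight one it returns an earlier, possibly worse result, which is exactly the compromise described above.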
The second scenario considers all the work that needs to be done to prepare the results for the user, such as training all the necessary models to get good results.
In this case, there aren’t many compromises to consider, because this work doesn’t have a time limit. In these situations, I generally try to train with as much data as possible to get better results.
Do you have more questions regarding Data Science? Let us know!