The data-centric conference
in Paris

DataXDay is a technical conference for enthusiasts and professionals from the world of data.

Big or small data, real-time or batch processing, classical machine learning or deep learning… As developers, these are some of the core subjects that keep us up at night.

Are you a data pro? Come listen to experts who work on data exploitation and related technologies.

200 Attendees

Come and meet more than 200 data lovers like yourself and let's talk data!

For developers and Tech Leads

Our goal is to dive deeply into some of the hottest technical subjects at the crossroads of data science, data engineering, cloud computing, craftsmanship and security.

One Day

On May 17th, spend your day enjoying talks from renowned speakers and sharing your ideas on what the future holds.

DataXDay will cover the following topics:

Program

09:25 - 09:30

Kevin Nelson
Developer Advocate

Thanks to machine learning and AI, applications are now being created that can see, hear, and understand the world around them. Learn how you can easily infuse AI into your business today. In addition to a guided walkthrough of easy-to-use machine learning APIs from Google Cloud (Cloud Vision, Cloud Video Intelligence, Cloud Speech, Cloud Natural Language, and Cloud Translation), we'll demonstrate how Google Cloud AutoML enables developers with limited machine learning expertise to train high-quality models by leveraging Google’s state-of-the-art transfer learning and Neural Architecture Search technology.

Florent Ramière
Technical Account Manager

The Kafka ecosystem goes way beyond the brokers: Kafka Connect, Kafka Streams and KSQL are amazing tools!
I propose to walk you through the implementation of all these components with a focus on streaming and monitoring.
Come join me to learn how to leverage Kafka to put your data in motion!

11:10 - 11:35

Pierre Villard
Solution Architect

Apache NiFi provides a revolutionary data flow management system with a broad range of integrations with existing data production, consumption, and analysis ecosystems, backed by robust data delivery and provenance infrastructure. This talk will mainly focus on how to manage the workflow lifecycle.

Adrien Morvan & Cristina Oprean
Machine Learning Engineers

Photobox's business is about pictures and derived products: we process 2 to 6 million photos on a daily basis. To suggest suitable products to our customers, we need to handle and better understand the content of their pictures.

Since the number of personal photos has greatly increased thanks to the development of digital cameras and smartphones, scalability is a must.
The goal of this presentation is to introduce our large scale automatic photo labelling pipeline.

12:50 - 14:00

Aurélia Nègre & Alberto Guggiola
Data Scientists

Ever been stuck in a data science use case where any approach seems too hard? Graph theory, describing a system just in terms of nodes and links, could be your answer! In the practical example we’ll show, we’ll try to find data science communities and their leaders in LinkedIn. Challenge accepted?

Jacek Laskowski

If you want to get even slightly better performance from your structured queries (regardless of whether they are batch or streaming), you have to peek at the foundations of the Dataset API, starting with QueryExecution. That's where any query ends up and where my talk starts. The talk will show you what stages a structured query has to go through before execution in Spark SQL. I'll be talking about the different phases of query execution and the logical and physical optimizations. At the end, I'll do a live coding session to show the steps to write logical and physical optimizations in Scala.

15:40 - 16:05

Sylvain Friquet
Software Engineer

This talk will cover how we redesigned our analytics API from the ground up to serve metrics in near real time from billions of events per day. We'll go from the tools we considered for the job to how we actually implemented our solution, starting from the datastore up to the whole data pipeline and its API, leveraging Golang, Kubernetes, GCP and Citus.

Thomas Lamirault
Software Architect

At BlaBlaCar, we have built a streaming platform to gain fast insights into the usage of our services. I will show you how BlaBlaCar built an automated streaming analysis of access logs to improve security and gain fine-grained knowledge of platform usage.

17:30 - 18:00

18:00 - 19:15

09:25 - 09:30

Olivier Bergeret
Solutions Architect Manager

Build, train, and deploy machine learning models at scale

Machine learning often feels a lot harder than it should to most developers because the process of building and training models, and then deploying them into production, is too complicated and too slow.

Amazon SageMaker includes modules that can be used together or independently to build, train, and deploy your machine learning models.

Charles Ollion
Co-founder

Beyond the AI hype, significant new possibilities in the world of computer vision have arisen in the last few years. However, deploying computer vision solutions still requires expert vision knowledge, business understanding, solid engineering and smart processes. I’ll present the challenges of computer vision applied to a vertical domain such as fashion, and how we solved them at Heuritech.

Ana Peleteiro Ramallo
Data Science Director

In recent years, deep learning (DL) has proven to be a transformative force that has made impressive advances in different fields. In fact, within the area of natural language processing (NLP), deep learning has outperformed many former state-of-the-art approaches, such as in machine translation or named entity recognition (NER). In this talk I will present various deep learning algorithms and architectures for NLP, with examples of how they can be leveraged in real-world applications.

11:10 - 11:35

Sylvain Lequeux
Data Engineer

Out of curiosity, ask the other people in the conference room who has already developed neural networks: you will see a lot of hands up. Then ask them how many of those models run in production: epic fail.

Come and see a solution to train and deploy TensorFlow models in the cloud using Google CloudML.

Matthieu Blanc
VP Product

Data lineage is defined as a data life cycle that includes the data’s origins and where it moves over time. It has become a crucial component of any data-centric company, whether for documentation, regulatory compliance, data quality or business impact assessment. This talk will offer an overview of the different approaches to constructing and visualizing metadata and data lineage in a Big Data environment.

12:50 - 14:00

Pauline Ballereau & Nicolas Laille
Data Scientist & Data Engineer

Join the journey of a data scientist on the way to industrialization... From notebook to proof of concept, from proof of concept to production, we will cover what happened at Air France. We won’t offer golden rules, just a true story. What exactly does industrializing data science mean? How do you package data science models? How should the roles of data scientists and data engineers fit together? Is continuous integration a wild dream for data scientists? This journey will give you key concepts that worked at Air France and might shed new light on your own data science journey.

Samah Ghalloussi
Data Scientist, Entrepreneur

I tested several platforms for creating chatbots with the objective of simulating a patient coming to the emergency room so that medical students could ask questions to establish a diagnosis.

Major advances in the fields of Natural Language Processing and Artificial Intelligence have led to the emergence of chatbot platforms that let you develop your own agent from a web service.

I will present 4 platforms from major technology companies offering their service in French.

Pablo Lopez & Pierre Sendorek
CTO & Data Scientist

During DataXDay, you'll hear a lot about machine learning and deep learning. But sometimes, combining those advanced techniques with a more "traditional" approach can enhance results in a spectacular way. See how we, a data scientist and a software engineer, managed to build an identity card recognition API.

15:40 - 16:05

Vincent Poncet
Solutions Engineer

Millions of people, objects and ‘things’ connecting with each other are changing the way organisations and consumers interact with each other and the environment around them. Data comes from different geographical locations and across multiple channels. Managing this explosion of high-velocity dynamic data while maintaining customer privacy is a challenge with legacy systems.

Samya Barkaoui & Pierre Schmidt
Head of Data & Lead backend developer - Toucan-Toco

Data scientists, just like their ancestors, statisticians and computer scientists, work on notoriously complex subjects with advanced methods... yet their expertise and their practices have a growing impact on everyone's lives. We aim to demonstrate that data storytelling, with its concepts and tools, is key to the future of data science because of its power to convey complex data insights to everyone.

17:30 - 18:00

18:00 - 19:15

Subject to modifications

Talk in French
Talk in English
Keynote & Break

Speakers

Charles Ollion - Heuritech
Deep learning for vision into the wild

Co-founder - Heuritech

Charles Ollion is co-founder of Heuritech, a startup specializing in deep learning and computer vision. He holds a PhD in machine learning and teaches deep learning at Ecole Polytechnique and EPITA.

Sylvain Friquet - Algolia
Building a Real Time Analytics API at Scale

Software Engineer - Algolia

Sylvain is a software engineer passionate about large-scale infrastructures. He is currently working on the Analytics feature of Algolia. Previously, he was CTO of a biotech startup and a software engineer at Facebook, where he worked on graph search and ads products such as Slideshow Ads.

Vincent Poncet - DataStax
How to get real-time value from your IoT data?

Solutions Engineer, DataStax

Vincent has over 13 years of experience in the IT industry. He went from SOA and ESB to MDM and finally to Big Data, from Hadoop to Cassandra. He worked as a consultant in his early years and then moved into software presales. Vincent has worked at DataStax for about 2 years, where he helps his customers embrace NoSQL and Big Data technologies.

Florent Ramière - Confluent
Kafka beyond the brokers: Stream processing and Monitoring

Technical Account Manager, Confluent

Florent is a technical account manager at Confluent. His job is to sit with customers and help them succeed with Kafka, so he knows a thing or two about it.

Alberto Guggiola - Quantmetry
Exploring graphs: looking for communities & leaders

Data Scientist - Quantmetry

Alberto earned a PhD in theoretical physics for his work on rare events taking place on graphs, and since then he has been trying to convince anybody (clients, colleagues, relatives, people on the street) of the added value of this approach.

Adrien Morvan - Photobox
Transforming pictures into memories

Machine learning engineer - Photobox

Adrien is an ML engineer at Photobox. He has worked on various computer vision subjects, such as simultaneous localisation and mapping and face recognition. He now focuses on topics like recommendation to deliver automated solutions at scale.

Jacek Laskowski
The internals of query execution in Spark SQL

Software Developer

Jacek is an independent consultant, software developer and technical instructor specializing in Apache Spark, Apache Kafka and Kafka Streams (with Scala, sbt, Kubernetes, DC/OS, Apache Mesos, and Hadoop YARN). He offers software development and consultancy services with very hands-on in-depth workshops and mentoring.

Samya Barkaoui - Toucan-Toco
Visualizing algorithms

Head of data, Toucan-Toco

Samya has 6 years of experience in data. She started out at a consulting company specializing in data science. She studied at the French engineering school Ecole des Mines de Paris, where she specialized in statistics. She supervises statistics projects with ENSAE students.

Thomas Lamirault - BlaBlaCar
Real-Time Access log analysis

Software Architect - BlaBlaCar

Thomas is a data software architect at BlaBlaCar and has been in the IT industry for 11 years. Besides being a passionate Java developer, he worked as a data engineer for Ericsson and Bouygues Telecom. He now brings his passion for and experience with Flink and Beam to building the next data platform at BlaBlaCar.

Pauline Ballereau - Air France
A data scientist journey to industrialization of machine learning

Data Scientist, Air France

Pauline is a data scientist at Air France-KLM. She is currently working on recommender systems and digital analytics projects. She holds an MS degree in data science and operations research.

Matthieu Blanc - Zeenea
Data lineage: visualize the data life cycle

VP Product - Zeenea

Matthieu has a data architect background. He co-founded Zeenea in 2017. The company develops a data catalog connected to Big Data systems. It centralizes all their data and metadata to provide a self-service, collaborative data solution. Matthieu is the VP Product of Zeenea.

Aurélia Nègre - Quantmetry
Exploring graphs: looking for communities & leaders

Data Scientist - Quantmetry

Aurélia Nègre has a background in statistics, and prior to working at Quantmetry, she worked at the French Central Bank where she designed and implemented credit risk models for structured products.

Ana Peleteiro Ramallo - Tendam
The wonders of deep learning: how to leverage it for natural language processing

Data Science Director, Tendam

I am currently the Data Science Director at Tendam, where I lead the data science initiatives in the company. Prior to that, I was a Senior Data Scientist at Zalando, where I built data-driven products that provided fashion insights using Machine Learning and Deep Learning. I hold a PhD in Artificial Intelligence, and I have 30+ international peer-reviewed publications, as well as having spoken at 10+ international conferences. I am a firm advocate of knowledge sharing, as well as promoting women in tech initiatives.

Cristina Oprean - Photobox
Transforming pictures into memories

Machine learning engineer - Photobox

Cristina works as an R&D engineer in machine learning at Photobox. She earned a PhD focused on handwriting recognition from Telecom ParisTech. Her current work is centered on adapting and applying the state of the art in computer vision and recommendation to Photobox's innovative products.

Pablo Lopez - Xebia
Computer vision: a pragmatic alliance between deep learning and a more "traditional" technique.

CTO - Xebia

Pablo has strong knowledge of software development and is passionate about technology in general. He is always eager to discover new fields of interest, so pairing with a data scientist was a way for him to set foot on a new playing field.

Pierre Schmidt - Toucan-Toco
Visualizing algorithms

Lead backend developer, Toucan-Toco

Pierre has 10 years of experience working as a data engineer. He started out in London in the field of industrial manufacturing automation software. He later moved, in Paris, to data engineering and distributed systems for web applications at Sen.se and Deezer. He studied at Ecole Normale Supérieure, where he focused on logic and its applications in computer science.

Kevin Nelson - Google
A crash course on Google Cloud AutoML and machine learning APIs

Google Cloud - Developer Advocate, Google

Kevin is a Google Cloud Developer Advocate focused on storage and machine learning. Before joining the Cloud team, he was a lead Product Manager on Google Drive. Prior to joining Google in 2014, he was an entrepreneur with over 20 years of experience building and managing software and SaaS companies. In addition to working at Google, he sits on the board of Quantum Scientific Imaging, a company he co-founded that designs and manufactures scientific CCD cameras for applications requiring superior image performance, such as astronomical and medical imaging.

Nicolas Laille - Xebia
A data scientist journey to industrialization of machine learning

Data Engineer, Xebia

I started as a back-end developer before diving deep into big data as a data engineer. I am now helping Air France industrialize their own data science projects.

Pierre Villard - Hortonworks
How to deal with workflows lifecycle in Apache NiFi?

Solution Architect - Hortonworks

Involved in the Apache NiFi community since 2015, Pierre is a committer and PMC member of the Apache NiFi project. Sincerely convinced by the open source software model, he has also been a Solution Architect at Hortonworks since 2016.

Sylvain Lequeux - Xebia
Tensors in the sky with CloudML

Data Engineer - Xebia

Sylvain is a Data Engineer at Xebia. He delivers Cloudera Administrator and Machine Learning with Spark training courses. He is a certified Cloudera Developer and a Software Craftsmanship enthusiast.

Olivier Bergeret - AWS
Machine learning models at scale with Amazon SageMaker

Solutions Architect Manager, AWS

Solutions Architect Manager and Data/AI specialist at AWS, Olivier is the creator of two 1-day discovery workshops dedicated to AI and Big Data, and the author of content for SageMaker, the managed platform that enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale. Before joining AWS, Olivier spent the past 18 years as CTO of lacentrale.fr, as a Dev Manager, and before that as a developer for many startups and mid-sized media companies.

Samah Ghalloussi - Ministère des solidarités et de la santé
Enhancing medical student practice with patient-like chatbots

Data Scientist / Entrepreneur, Ministère des solidarités et de la santé

Samah worked for 3 years at the French Atomic Energy Commission (CEA) in the Natural Language Processing Lab, where she contributed to several Machine Learning projects as well as the creation of chatbots. She then joined the start-up Stryng Messaging Inc. in June 2017 to add Artificial Intelligence to this new messaging app dedicated to professionals. She is now a data scientist for the French Ministry of Health as a Public Interest Entrepreneur.

Pierre Sendorek - Xebia
Computer vision: a pragmatic alliance between deep learning and a more "traditional" technique.

Data Scientist - Xebia

Pierre Sendorek is passionate about machine learning and signal processing. He holds a PhD in signal processing and a master's degree in applied mathematics, as well as an engineering diploma. He currently works as a Data Scientist at Xebia.

Coming soon
Data Lover

Contact & access

The venue is located in the 11th arrondissement of Paris, a few steps from Metro Station Philippe Auguste on Line 2 or a 5-minute walk from Metro Station Charonne on Line 9.

COME TO THE CONFERENCE

PAN PIPER - 4 impasse Lamier, 75011 Paris

Brought to you by

Sponsors

Gold sponsors

Confluent
Saagie

Silver sponsors

With the support of

Girls in Tech Paris

Become a sponsor

Contact us