Estelle Scifo

(Graph) Data Scientist & Python developer

About me

You can find more information about my professional experience on my LinkedIn or Stack Overflow profiles, but here are the highlights.

Summary

I completed a PhD in particle physics in 2014, during which I was part of the exciting ATLAS experiment at CERN. By analyzing billions of data events, we were able to show the existence of a long-sought particle, the Higgs boson. Its discovery led to the 2013 Nobel Prize in Physics for the theorists who predicted the existence of this particle.

After this PhD, I decided to move to industry and explore other fields, still using data analysis and mathematical modeling to help solve real-life problems. A few years later, I discovered graphs through the world of graph databases, especially Neo4j, and I haven’t been able to stop reasoning in terms of graphs since.

In 2021-2022, I led the technical development of SmartGrid, a Canadian startup whose goal is to create a mirrorverse, a digital twin of the real world including people and their relationships with digital assets. I built the first PoC with a Python backend and a small demo frontend (ReactJS), and deployed everything on GCP.

Since 2022, I have been working at GraphAware as a Senior Machine Learning Engineer on exciting graph-related applications.

Tools

  • Data science & machine learning: Python data analysis toolkit, i.e. numpy, scipy, pandas, scikit-learn
    • Create the model (mathematical formulation)
    • Implementation (efficient code)
    • Validation with simulations (Monte-Carlo techniques)
    • Visualization (including maps and graphs)
  • Graph data analysis:
    • Neo4j graph database (certified): create the data model that best suits your use case
    • Graph algorithms: extract information from your graph data
    • Machine learning on graphs: apply predictive machine learning methods to graph data (graph embeddings…)
  • Web development: Django, HTML, CSS, JS; web services (RPC, REST, GraphQL)

  • Database: graph databases (Neo4j, ArangoDB), SQL (PostgreSQL, PostGIS)

  • Development: Python (including testing and packaging), ReactJS. I am also starting to build a few things with Rust and Flutter.

  • Cloud: AWS (S3), Google Cloud (Google App Engine)

Achievements

  • 2018-12: Neo4j Certified Professional
  • 2018-10: second prize at the Lux4Good hackathon (team of four people):
    • Project about measuring diversity among a company’s employees (gender, age, disability…)
    • Development of a Django website from scratch, visualization with D3.js: github repository

Publications

  • 2023: ‘Graph Data Science with Neo4j’ book: WIP

  • 2020-09: ‘Hands-on Graph Analytics with Neo4j’ book at Packt Publishing. Available on Amazon.

  • 2019-05: ‘Exploring Graph Algorithms with Neo4j’ video course at Packt Publishing. Available here.

Talks

  • 2022-11: NODES 2022, the annual online conference organized by Neo4j (I was a featured speaker ^^):
    • Building a Neo4j/Python Object Graph Mapper: a step-by-step guide leveraging Cypher map projection and Python dynamic typing
    • Recording available here: https://youtu.be/DKziks5jQvc
  • 2020-11: I was invited by the WinDSML Meetup group from Brussels to give a talk about:
    • Graph Databases for Machine Learning: Youtube
  • 2020-10: NODES 2020 Online Conference organized by Neo4j:
    • Extending a Knowledge Graph from Wikidata: Youtube
  • 2019-10: PyConFR in Bordeaux, France (Content is in French):

Work Experience

  • From 2022: Senior Machine Learning Engineer at GraphAware, a Neo4j partner

  • 2021-2022: Technical co-founder at SmartGrid (remote)
    • Built solution prototype using Python & Rust for the backend, ReactJS for front-end, deployed on GCP
  • 2020-04 / 2021-07: Data scientist & backend developer at Deleev (www.labellevie.com), dark grocery mainly operating in Paris, France:
    • Logistics and automation through operations research tools (delivery routes and order dispatch to available drivers based on business constraints)
    • Code migration to newer package versions (including Python (3.4 => 3.8) and Django (1.9 => 2.2))
  • 2011 / today: Teaching
    • 2018- : remote mentor (Python, Neo4j, data science, SQL)
    • 2014- : private lessons, maths and physics, preparation for the French A-level equivalent
    • 2011-2014: Python practical sessions for undergraduate students
  • 2018-10 / 2020-03 : Lead Data Scientist at Motion-S (www.motion-s.com) (startup in the mobility area, Luxembourg):
    • Statistical models for driver road accident risk estimation from driver behaviour, contextual and road statistics data
    • Road network model using graph technologies
    • Dashboards to present results to prospects (Vue.js)
  • 2014-12 / 2018-09 : Data Scientist at effiCity (www.efficity.com/) (real estate company, France) (working remotely):
    • Real estate estimation algorithm: 30% improvement in estimation accuracy compared to the existing solution
      • Mathematical formulation with parameter fitting on real data
      • Implementation with the Python stack (numpy, pandas, scipy, sklearn)
      • Validation with simulations
      • Push to production (Python package, model persistence)
    • Statistics:
      • Regression algorithm (KNN)
      • Function minimization (scipy)
      • Clustering to identify cities with similar characteristics
    • As always, internal notes and code documentation
    • Participation in team projects, especially web development (Django), including CSS integration and JS (jQuery) scripts
  • 2011-2014: PhD in particle physics (Université Paris Sud/LAL, Orsay, France)
    • Search for the Higgs boson and, after its discovery, measurement of some properties (called couplings)
    • Hypothesis testing (likelihood) and best parameter estimation with the C++ ROOT framework
    • Analysis of billions of events on a distributed data analysis system (big data)