Senior
Registration: 16.05.2025

Marcio Gualtieri

Specialization: Python Developer / Data Engineer / ML Engineer
— I am a Senior Full-Stack Developer with a strong focus on back-end development, data engineering, and machine learning.
— My experience spans a wide variety of industries and company sizes, including advertising, finance, e-commerce, cybersecurity, and data analytics.
— I've worked with both large corporations and startups.
— In addition to my development expertise, I am a trained data scientist and have completed several certified specializations in data science and machine learning.
— I adhere to clean code principles and am a dedicated practitioner of test-driven development (TDD).
— As a lifelong learner, I enjoy expanding my knowledge by building projects.

Industry Knowledge:
— Full-Stack Development
— Back-End Web Development
— Data Engineering
— Object-Oriented Programming (OOP)
— Data Science
— Extract, Transform, Load (ETL)
— Natural Language Processing (NLP)
— ETL Tools
— Big Data
— Cloud Infrastructure
— Web Development
— Front-End Development
— Machine Learning
— Agile Methodologies
— TDD
— Clean Code
— Generative AI
— Large Language Models (LLM)
— Data Normalization
— Data Lake
— Data Marts
— Facts
— Dimensions

Skills

Databases
R
Terraform
MySQL
PostgreSQL
MongoDB
Cascading Style Sheets (CSS)
React
Angular
SQL
NoSQL
Django REST Framework
FastAPI
SQLAlchemy
Pydantic
Airflow
Google Cloud Platform (GCP)
Docker
HTML
JavaScript
TypeScript
Bootstrap
Linux
Python
Amazon Web Services (AWS)
Git
LangChain
HuggingFace
TensorFlow
PyTorch
Scikit-learn
Deep Learning
Large Language Models (LLMs)
Generative AI
BigQuery
Redshift
Kubernetes

Work experience

Full-Stack Developer
03.2023 - 07.2024 | Finmatics
OOP, Python, Django, DRF, FastAPI, AngularJS, NodeJS, Bootstrap, JavaScript, TypeScript, Apex Charts, HTML, CSS, PostgreSQL, Elasticsearch, Kibana, GitLab, Kubernetes, Celery, Redis, Machine Learning, Helm, Kubectl, Jasmine, Pytest, Cypress, OCR, LLMs, NLP, OpenAPI/Swagger
Finmatics leverages machine learning, specifically large language models (LLMs), to automate accounting tasks such as processing paper invoices and predicting missing or illegible information on invoices (see the sketch below).
● Developed full-stack features for the SaaS accounting platform, involving both back-end and front-end changes. These features required a comprehensive understanding of the platform to ensure seamless integration across both layers.
● Performed extensive maintenance and defect fixing, addressing numerous issues introduced in previous releases, many of which were unrelated to my direct contributions.
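Below is a minimal, illustrative sketch of the kind of LLM-assisted invoice-field completion described above. The field names, prompt, and call_llm() helper are hypothetical placeholders, and the actual provider/API is intentionally abstracted away; this is not the production implementation.

```python
# Hedged sketch: ask an LLM to fill in invoice fields that OCR could not recover.
# The prompt, field names, and the call_llm() helper are hypothetical placeholders.
import json


def predict_missing_fields(ocr_text: str, call_llm) -> dict:
    """Request the missing invoice fields as JSON from an LLM, given raw OCR text."""
    prompt = (
        "Extract the following fields from this invoice text and answer with JSON only: "
        "supplier_name, invoice_number, invoice_date, total_amount.\n\n" + ocr_text
    )
    raw_answer = call_llm(prompt)  # any chat-completion style API could back this call
    return json.loads(raw_answer)
```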
Full-Stack Developer
07.2022 - 03.2023 | Silvr
OOP, Python, Django, DRF, PostgreSQL, Celery, Redis, Django Templates, Tailwind, JavaScript, Alpine.js, React.js, Node.js, HTML, CSS, Terraform, Jasmine, Playwright, Pytest, OpenAPI/Swagger
Silvr provides revenue-based financing for small companies, particularly in the e-commerce sector, catering to two types of users: businesses seeking financing and risk analysts who evaluate these companies' eligibility based on collected financial data (e.g., financial history, sales, and expenses).
● Developed full-stack features for Silvr's revenue-based financing platform. Most of the features I built involved both back-end and front-end changes, requiring a deep understanding of both the transactional and analytical aspects of the platform.
● Performed maintenance and defect fixing, addressing numerous issues, the majority of which were introduced in previous releases and unrelated to my work.
Senior Back-end Developer & Data Engineer
09.2021 - 07.2022 | Marketer
OOP, Python, Django, DRF, Flask, Kubernetes, PostgreSQL, Airflow, Pandas, NumPy, GitLab (Auto DevOps), Google Cloud, JSON, SOAP, SQL, ETL, ELT, Data Modeling, Normalization, Facts, Dimensions, Star Schema, Snowflake Schema, Machine Learning, Pytest, Helm, Kubectl, OpenAPI/Swagger
Marketer leverages data analytics and machine learning to optimize real estate advertising.
● My team, the Data Science team, was responsible for ETL/ELT processes involving real estate data, as well as machine learning models providing predictions (e.g., selling price, advertising impressions, and clicks).
● We also distributed real estate data and insights across the company via REST APIs, which were used for developing web and mobile applications.
● Developed the real estate data distribution back-end (Django, DRF) using a microservices architecture. The API integrates data from multiple sources, including machine learning prediction APIs and data pipelines.
● Built Airflow DAGs to parse and store real estate data from customer CRM systems (see the sketch below).
● Managed Kubernetes clusters and handled DevOps responsibilities, including CI/CD implementation via GitLab's Auto DevOps (auto-deploy, Helm).
● Developed RabbitMQ consumer services to sync real estate data from the REST API in real time.
● Implemented end-to-end testing, covering the entire data flow from consumption to distribution via the REST API.
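As a rough illustration of the Airflow work mentioned above, here is a minimal DAG sketch with an extract, transform, and load chain. The DAG id, schedule, and task callables are hypothetical and only outline the shape of such a pipeline, not the actual Marketer code.

```python
# Illustrative Airflow DAG: pull listings from a CRM export, normalize them, load to the warehouse.
# DAG id, schedule, and callables are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_listings(**context):
    """Fetch raw listing records from the CRM export endpoint (placeholder)."""
    ...

def transform_listings(**context):
    """Normalize fields (prices, addresses, dates) into the warehouse schema (placeholder)."""
    ...

def load_listings(**context):
    """Upsert normalized rows into the real-estate fact table (placeholder)."""
    ...

with DAG(
    dag_id="crm_real_estate_sync",  # hypothetical name
    start_date=datetime(2021, 9, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_listings)
    transform = PythonOperator(task_id="transform", python_callable=transform_listings)
    load = PythonOperator(task_id="load", python_callable=load_listings)

    extract >> transform >> load
```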
Data Engineer
04.2021 - 09.2021 | Kheiron Medical
Python, Pytest, Airflow, Docker, AWS, Redshift, S3, PostgreSQL, AWS Glue Data Catalog, ETL, ELT, Data Modeling, Normalization, Facts, Dimensions, Star Schema, Snowflake Schema
Kheiron leverages machine learning to diagnose breast cancer using X-ray images and metadata (DICOM).
● Designed and implemented data pipelines, performing data normalization and ETL/ELT on medical data, including DICOM files, to support machine learning workflows (see the sketch below).
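A minimal sketch of the kind of DICOM-metadata extraction step such a pipeline might start with, assuming the pydicom library (which the CV does not name); the selected header fields are hypothetical examples, not the real schema.

```python
# Hedged sketch: flatten selected DICOM header fields into a warehouse-ready row.
# Assumes pydicom purely for illustration; the field list is hypothetical.
import pydicom


def dicom_to_row(path: str) -> dict:
    """Read one DICOM file's metadata (skipping pixel data) and return a flat record."""
    ds = pydicom.dcmread(path, stop_before_pixels=True)  # metadata only, no image payload
    return {
        "sop_instance_uid": getattr(ds, "SOPInstanceUID", None),
        "study_date": getattr(ds, "StudyDate", None),
        "modality": getattr(ds, "Modality", None),
        "manufacturer": getattr(ds, "Manufacturer", None),
    }
```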
Senior Back-end Developer
05.2020 - 04.2021 | Soda
OOP, Python, Java, Jersey Framework, PostgreSQL, MySQL, BigQuery, Redshift, Athena, Snowflake, GitHub Actions, Data Warehouses, CI/CD
Soda produces metrics and alarms for evaluating and monitoring data quality.
● Developed and maintained features for the platform's Python SDK, including support for new data warehouses such as Snowflake and BigQuery.
● Built new features for the Java RESTful back-end using Java and Maven.
● Handled DevOps tasks, including implementing CI/CD pipelines using GitHub Actions.
Senior Back-end Developer & Data Engineer
02.2019 - 05.2020 | PolySwarm
Python, Elasticsearch, SQLAlchemy, PostgreSQL, Flask, Django, Celery, Kafka, Scala, Gatling, Jupyter Notebooks, Pandas, Spark (PySpark), EMR, Kubernetes (Helm, kubectl, charts), AWS (S3), Docker, GitLab CI
PolySwarm leverages blockchain technology to manage multiple third-party malware detectors and generate a consensus prediction (i.e., determining whether a file is malware or not). It also serves as a malware database used by security researchers, similar to VirusTotal from Google.
● Configured and deployed Elasticsearch settings to enable malware metadata search, including mapping index types, developing pipelines/processors/tokenizers/analyzers using "Painless" scripting and Python, and writing Elasticsearch DSL queries.
● Developed similarity detection using TLSH/SSDEEP hashing and clustering of categorical features via K-modes and EMR/PySpark (see the TLSH sketch below).
● Built REST APIs using Flask, SQLAlchemy, Django, and Celery.
● Implemented load tests using Gatling in Scala.
● Created data pipelines for processing malware metadata and generating insights.
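As an illustration of TLSH-based similarity detection, here is a minimal sketch using the py-tlsh package; the distance threshold is a hypothetical tuning parameter, not the value used in production.

```python
# Hedged sketch: compare two artifacts by TLSH digest distance (lower diff = more similar).
# Note that TLSH needs a minimum input size (roughly 50 bytes) to produce a digest.
import tlsh


def are_similar(file_a: bytes, file_b: bytes, max_distance: int = 100) -> bool:
    """Return True when the two byte strings hash to nearby TLSH digests."""
    digest_a = tlsh.hash(file_a)
    digest_b = tlsh.hash(file_b)
    return tlsh.diff(digest_a, digest_b) <= max_distance  # threshold is illustrative
```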
Small Business Owner
02.2019 - present | Karakuri Apps
REST APIs, FastAPI, SQLAlchemy, Pydantic, Django, DRF, Django Templates, PostgreSQL, Redis, ActiveMQ, MemCache, LLMs, RAG, OpenAI, LangChain (Python), Rasa, Jupyter, Scikit-learn, Hugging Face, PyTorch, TensorFlow, Airflow, Pandas, Redshift, BigQuery, S3, Google Storage, Kafka, PyMuPDF, Unstructured.io
● I operate my own small company, primarily to generate invoices for my contract roles.
● I provide consultancy services in software development, specializing in full-stack development, data engineering, and machine learning for startups and IT companies.
● My clients include PolySwarm, Marketer, Silvr, and Finmatics.
Student
08.2018 - 01.2019 | Sabbatical
Studies
I took time off to enhance my skill set, particularly in data science and machine learning, completing several certified training programs, including the following:
● Coursera/Johns Hopkins: Data Science Specialization (10 courses, including a capstone project).
● Coursera/deeplearning.ai: Deep Learning Specialization (4 courses).
● Coursera: Functional Programming in Scala Specialization (5 courses, including a Spark/Scala course and a capstone project).
● Additionally, I completed various other courses in data engineering tools, such as Spark (in Python, Scala, and Java), web development (Ruby on Rails, JavaScript, HTML/CSS, AngularJS), data science (statistics, data analysis, data visualization), and machine learning (R, Jupyter, Pandas, TensorFlow, H2O).
Coursework:
● CS-212: Design of Computer Programs.
● CS-215: Algorithms, Crunching Social Networks.
● CS-222: Differential Equations in Action, Making Math Matter.
● CS-373: Programming a Robotic Car.
● CS-387: Applied Cryptography, The Science of Secrets.
● Introduction to Artificial Intelligence, Advanced Track.
● Introduction to Complexity (SFI/Complexity Explorer).
● Introduction to Non-Linear Dynamics and Chaos (SFI/Complexity Explorer).
● Machine Learning, Advanced Track (Coursera).
● Non-Linear Dynamics, Mathematical & Computational Approaches (SFI/Complexity Explorer).
● Quantum Mechanics for Scientists & Engineers (Stanford Online).
● ST-101: Intro to Statistics, Making Decisions Based on Data (Udacity).
Senior Back-end Developer
05.2016 - 08.2018 | Zalando
OOP, Scala, Java, Clojure, AWS, Kafka, Kafka Streams, Spark, Python, CI/CD (GitHub, Jenkins), TDD (ScalaTest), Agile/SCRUM (JIRA)
Zalando is one of the largest e-commerce retailers in Europe.
● Developed back-end services using Scala and Play Framework.
● Created migration tools in Python.
● Configured CI/CD pipelines using Jenkins and Build per Branches.
● Contributed to a machine learning classifier developed in Clojure with Sparkling.
● Worked on data processing jobs using Scala with EMR and Spark.
● Maintained a legacy web scraper developed in Scala and Kafka.
Senior Back-end Developer
12.2014 - 05.2016 | Dun & Bradstreet
OOP, Java, J2EE, Web Services (RESTful, Jersey), Web Technologies (Spring, HTML, CSS, FreeMarker), Amazon AWS (DynamoDB, S3, Elastic Beanstalk), Hadoop, Hive, Impala, HBase, Maven, Continuous Integration (Git, Stash, Jenkins), TDD (JUnit, Hamcrest, WireMock), BDD (JBehave), Tomcat, Python, Bash, Linux (CentOS 6), IntelliJ, Agile/SCRUM (JIRA)
Dun & Bradstreet is a leading global provider of business decisioning data and analytics. The company empowers organizations to make informed decisions by delivering comprehensive financial, credit, and risk management information.
● Designed and implemented a test automation framework based on BDD (Behavior-Driven Development) using JBehave, Spring, and Java.
● Developed back-end features using Java and Jersey.
Senior Back-end Developer & Data Engineer
08.2013 - 12.2014 | AOL
OOP, Java, J2EE, Web Services (RESTful, Jersey), Web Technologies (Spring, Hibernate, JavaScript, HTML, CSS, ExtJS, CoffeeScript), TDD (JUnit, Hamcrest, Mockito), BDD (JBehave, Jasmine), ActiveMQ, Python, Bash, Oracle, MySQL, Hadoop (Java MapReduce, Pig, and Streaming Jobs), Eclipse, Continuous Integration (Git, Stash, Jenkins), Linux (CentOS), Agile/SCRUM (JIRA, VersionOne)
AOL is a global leader in digital media and online advertising, recognized for its innovative technology and extensive audience reach, offering a wide range of products and services that cater to consumers and businesses alike.
● Developed the eDemo Project (Demographic Data Collection), collaborating across multiple AOL targeting modules. This project involved collecting demographic data (age, gender, profession, location) from cookies to optimize ad targeting across AOL and partner sites.
● Implemented data parsing using Avro, integrated with a NoSQL storage system (Aerospike), and utilized a REST interface for configuration stored in an Oracle database.
● The project followed Agile/SCRUM methodologies and employed a continuous integration environment (Jenkins, Maven, PyBot).
● Performed maintenance and defect fixing, dedicating six months to resolving production issues and achieving the best performance review across all targeting teams.
● Contributed to mining impression data (Hadoop Java, Pig UDFs, Streaming/Perl) for customer billing.

Educational background

Electrical Engineering, Computer Science
Until 1999
USP - Universidade de São Paulo

Languages

English: Upper Intermediate