Senior
Registration: 16.05.2025

Marcio Gualtieri

Specialization: Python Developer / Data Engineer / ML Engineer
— I am a Senior Full-Stack Developer with a strong focus on back-end development, data engineering, and machine learning.
— My experience spans a wide variety of industries and company sizes, including advertising, finance, e-commerce, cybersecurity, and data analytics.
— I've worked with both large corporations and startups.
— In addition to my development expertise, I am a trained data scientist and have completed several certified specializations in data science and machine learning.
— I adhere to clean code principles and am a dedicated practitioner of test-driven development (TDD).
— As a lifelong learner, I enjoy expanding my knowledge by building projects.

Industry Knowledge:
— Full-Stack Development
— Back-End Web Development
— Data Engineering
— Object-Oriented Programming (OOP)
— Data Science
— Extract, Transform, Load (ETL)
— Natural Language Processing (NLP)
— ETL Tools
— Big Data
— Cloud Infrastructure
— Web Development
— Front-End Development
— Machine Learning
— Agile Methodologies
— TDD
— Clean Code
— Generative AI
— Large Language Models (LLM)
— Data Normalization
— Data Lake
— Data Marts
— Facts
— Dimensions

Skills

Databases
R
Terraform
MySQL
PostgreSQL
MongoDB
Cascading Style Sheets (CSS)
React
Angular
SQL
NoSQL
Django REST Framework
FastAPI
SQLAlchemy
Pydantic
Airflow
Google Cloud Platform (GCP)
Docker
HTML
JavaScript
TypeScript
Bootstrap
Linux
Python
Amazon Web Services (AWS)
Git
LangChain
HuggingFace
TensorFlow
PyTorch
Scikit-learn
Deep Learning
Large Language Models (LLMs)
Generative AI
BigQuery
Redshift
Kubernetes

Work experience

Full-Stack Developer
03.2023 - 07.2024 | Finmatics
OOP, Python, Django, DRF, FastAPI, AngularJS, NodeJS, Bootstrap, JavaScript, TypeScript, Apex Charts, HTML, CSS, PostgreSQL, Elasticsearch, Kibana, GitLab, Kubernetes, Celery, Redis, Machine Learning, Helm, Kubectl, Jasmine, Pytest, Cypress, OCR, LLMs, NLP, OpenAPI/Swagger
Finmatics leverages machine learning, specifically large language models (LLMs), to automate accounting tasks such as processing paper invoices and predicting missing or illegible information on invoices (see the sketch below).
● Developed full-stack features for the SaaS accounting platform, involving both back-end and front-end changes. These features required a comprehensive understanding of the platform to ensure seamless integration across both layers.
● Performed extensive maintenance and defect fixing, addressing numerous issues introduced in previous releases, many of which were unrelated to my direct contributions.
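Below is a minimal, illustrative sketch of the kind of LLM-assisted invoice-field completion described above. The field names, prompt, and call_llm() helper are hypothetical placeholders, and the actual provider/API is intentionally abstracted away; this is not the production implementation.

```python
# Hedged sketch: ask an LLM to fill in invoice fields that OCR could not recover.
# The prompt, field names, and the call_llm() helper are hypothetical placeholders.
import json


def predict_missing_fields(ocr_text: str, call_llm) -> dict:
    """Request the missing invoice fields as JSON from an LLM, given raw OCR text."""
    prompt = (
        "Extract the following fields from this invoice text and answer with JSON only: "
        "supplier_name, invoice_number, invoice_date, total_amount.\n\n" + ocr_text
    )
    raw_answer = call_llm(prompt)  # any chat-completion style API could back this call
    return json.loads(raw_answer)
```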
Full-Stack Developer
07.2022 - 03.2023 | Silvr
OOP, Python, Django, DRF, PostgreSQL, Celery, Redis, Django Templates, Tailwind, JavaScript, Alpine.js, React.js, Node.js, HTML, CSS, Terraform, Jasmine, Playwright, Pytest, OpenAPI/Swagger
Silvr provides revenue-based financing for small companies, particularly in the e-commerce sector, catering to two types of users: businesses seeking financing and risk analysts who evaluate these companies' eligibility based on collected financial data (e.g., financial history, sales, and expenses).
● Developed full-stack features for Silvr's revenue-based financing platform. Most of the features I built involved both back-end and front-end changes, requiring a deep understanding of both the transactional and analytical aspects of the platform.
● Performed maintenance and defect fixing, addressing numerous issues, the majority of which were introduced in previous releases and unrelated to my work.
Senior Back-end Developer & Data Engineer
09.2021 - 07.2022 | Marketer
OOP, Python, Django, DRF, Flask, Kubernetes, PostgreSQL, Airflow, Pandas, NumPy, GitLab (Auto DevOps), Google Cloud, JSON, SOAP, SQL, ETL, ELT, Data Modeling, Normalization, Facts, Dimensions, Star Schema, Snowflake Schema, Machine Learning, Pytest, Helm, Kubectl, OpenAPI/Swagger
Marketer leverages data analytics and machine learning to optimize real estate advertising.
● My team, the Data Science team, was responsible for ETL/ELT processes involving real estate data, as well as machine learning models providing predictions (e.g., selling price, advertising impressions, and clicks).
● We also distributed real estate data and insights across the company via REST APIs, which were used for developing web and mobile applications.
● Developed the real estate data distribution back-end (Django, DRF) using a microservices architecture. The API integrates data from multiple sources, including machine learning prediction APIs and data pipelines.
● Built Airflow DAGs to parse and store real estate data from customer CRM systems (see the sketch below).
● Managed Kubernetes clusters and handled DevOps responsibilities, including CI/CD implementation via GitLab's Auto DevOps (auto-deploy, Helm).
● Developed RabbitMQ consumer services to sync real estate data from the REST API in real time.
● Implemented end-to-end testing, covering the entire data flow from consumption to distribution via the REST API.
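As a rough illustration of the Airflow work mentioned above, here is a minimal DAG sketch with an extract, transform, and load chain. The DAG id, schedule, and task callables are hypothetical and only outline the shape of such a pipeline, not the actual Marketer code.

```python
# Illustrative Airflow DAG: pull listings from a CRM export, normalize them, load to the warehouse.
# DAG id, schedule, and callables are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_listings(**context):
    """Fetch raw listing records from the CRM export endpoint (placeholder)."""
    ...

def transform_listings(**context):
    """Normalize fields (prices, addresses, dates) into the warehouse schema (placeholder)."""
    ...

def load_listings(**context):
    """Upsert normalized rows into the real-estate fact table (placeholder)."""
    ...

with DAG(
    dag_id="crm_real_estate_sync",  # hypothetical name
    start_date=datetime(2021, 9, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_listings)
    transform = PythonOperator(task_id="transform", python_callable=transform_listings)
    load = PythonOperator(task_id="load", python_callable=load_listings)

    extract >> transform >> load
```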
Data Engineer
04.2021 - 09.2021 | Kheiron Medical
Python, Pytest, Airflow, Docker, AWS, Redshift, S3, PostgreSQL, AWS Glue Data Catalog, ETL, ELT, Data Modeling, Normalization, Facts, Dimensions, Star Schema, Snowflake Schema
Kheiron leverages machine learning to diagnose breast cancer using X-ray images and metadata (DICOM).
● Designed and implemented data pipelines, performing data normalization and ETL/ELT on medical data, including DICOM files, to support machine learning workflows (see the sketch below).
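A minimal sketch of the kind of DICOM-metadata extraction step such a pipeline might start with, assuming the pydicom library (which the CV does not name); the selected header fields are hypothetical examples, not the real schema.

```python
# Hedged sketch: flatten selected DICOM header fields into a warehouse-ready row.
# Assumes pydicom purely for illustration; the field list is hypothetical.
import pydicom


def dicom_to_row(path: str) -> dict:
    """Read one DICOM file's metadata (skipping pixel data) and return a flat record."""
    ds = pydicom.dcmread(path, stop_before_pixels=True)  # metadata only, no image payload
    return {
        "sop_instance_uid": getattr(ds, "SOPInstanceUID", None),
        "study_date": getattr(ds, "StudyDate", None),
        "modality": getattr(ds, "Modality", None),
        "manufacturer": getattr(ds, "Manufacturer", None),
    }
```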
Senior Back-end Developer
05.2020 - 04.2021 | Soda
OOP, Python, Java, Jersey Framework, PostgreSQL, MySQL, BigQuery, Redshift, Athena, Snowflake, GitHub Actions, Data Warehouses, CI/CD
Soda produces metrics and alarms for evaluating and monitoring data quality.
● Developed and maintained features for the platform's Python SDK, including support for new data warehouses such as Snowflake and BigQuery.
● Built new features for the Java RESTful back-end using Java and Maven.
● Handled DevOps tasks, including implementing CI/CD pipelines using GitHub Actions.
Senior Back-end Developer & Data Engineer
02.2019 - 05.2020 | PolySwarm
Python, Elasticsearch, SQLAlchemy, PostgreSQL, Flask, Django, Celery, Kafka, Scala, Gatling, Jupyter Notebooks, Pandas, Spark (PySpark), EMR, Kubernetes (Helm, kubectl, charts), AWS (S3), Docker, GitLab CI
PolySwarm leverages blockchain technology to manage multiple third-party malware detectors and generate a consensus prediction (i.e., determining whether a file is malware or not). It also serves as a malware database used by security researchers, similar to VirusTotal from Google.
● Configured and deployed Elasticsearch settings to enable malware metadata search, including mapping index types, developing pipelines/processors/tokenizers/analyzers using "Painless" scripting and Python, and writing Elasticsearch DSL queries.
● Developed similarity detection using TLSH/SSDEEP hashing and clustering of categorical features via K-modes and EMR/PySpark (see the TLSH sketch below).
● Built REST APIs using Flask, SQLAlchemy, Django, and Celery.
● Implemented load tests using Gatling in Scala.
● Created data pipelines for processing malware metadata and generating insights.
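As an illustration of TLSH-based similarity detection, here is a minimal sketch using the py-tlsh package; the distance threshold is a hypothetical tuning parameter, not the value used in production.

```python
# Hedged sketch: compare two artifacts by TLSH digest distance (lower diff = more similar).
# Note that TLSH needs a minimum input size (roughly 50 bytes) to produce a digest.
import tlsh


def are_similar(file_a: bytes, file_b: bytes, max_distance: int = 100) -> bool:
    """Return True when the two byte strings hash to nearby TLSH digests."""
    digest_a = tlsh.hash(file_a)
    digest_b = tlsh.hash(file_b)
    return tlsh.diff(digest_a, digest_b) <= max_distance  # threshold is illustrative
```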
Small Business Owner
02.2019 - present | Karakuri Apps
REST APIs, FastAPI, SQLAlchemy, Pydantic, Django, DRF, Django Templates, PostgreSQL, Redis, ActiveMQ, MemCache, LLMs, RAG, OpenAI, LangChain (Python), Rasa, Jupyter, Scikit-learn, Hugging Face, PyTorch, TensorFlow, Airflow, Pandas, Redshift, BigQuery, S3, Google Storage, Kafka, PyMuPDF, Unstructured.io
● I operate my own small company, primarily to generate invoices for my contract roles.
● I provide consultancy services in software development, specializing in full-stack development, data engineering, and machine learning for startups and IT companies.
● My clients include PolySwarm, Marketer, Silvr, and Finmatics.
Student
08.2018 - 01.2019 | Sabbatical
Studies
I took time off to enhance my skill set, particularly in data science and machine learning, completing several certified training programs, including the following:
● Coursera/Johns Hopkins: Data Science Specialization (10 courses, including a capstone project).
● Coursera/deeplearning.ai: Deep Learning Specialization (4 courses).
● Coursera: Functional Programming in Scala Specialization (5 courses, including a Spark/Scala course and a capstone project).
● Additionally, I completed various other courses in data engineering tools, such as Spark (in Python, Scala, and Java), web development (Ruby on Rails, JavaScript, HTML/CSS, AngularJS), data science (statistics, data analysis, data visualization), and machine learning (R, Jupyter, Pandas, TensorFlow, H2O).
Coursework:
● CS-212: Design of Computer Programs.
● CS-215: Algorithms, Crunching Social Networks.
● CS-222: Differential Equations in Action, Making Math Matter.
● CS-373: Programming a Robotic Car.
● CS-387: Applied Cryptography, The Science of Secrets.
● Introduction to Artificial Intelligence, Advanced Track.
● Introduction to Complexity (SFI/Complexity Explorer).
● Introduction to Non-Linear Dynamics and Chaos (SFI/Complexity Explorer).
● Machine Learning, Advanced Track (Coursera).
● Non-Linear Dynamics, Mathematical & Computational Approaches (SFI/Complexity Explorer).
● Quantum Mechanics for Scientists & Engineers (Stanford Online).
● ST-101: Intro to Statistics, Making Decisions Based on Data (Udacity).
Senior Back-end Developer
05.2016 - 08.2018 | Zalando
OOP, Scala, Java, Clojure, AWS, Kafka, Kafka Streams, Spark, Python, CI/CD (GitHub, Jenkins), TDD (ScalaTest), Agile/SCRUM (JIRA)
Zalando is one of the largest e-commerce retailers in Europe.
● Developed back-end services using Scala and Play Framework.
● Created migration tools in Python.
● Configured CI/CD pipelines using Jenkins and Build per Branches.
● Contributed to a machine learning classifier developed in Clojure with Sparkling.
● Worked on data processing jobs using Scala with EMR and Spark.
● Maintained a legacy web scraper developed in Scala and Kafka.
Senior Back-end Developer
12.2014 - 05.2016 | Dun & Bradstreet
OOP, Java, J2EE, Web Services (RESTful, Jersey), Web Technologies (Spring, HTML, CSS, FreeMarker), Amazon AWS (DynamoDB, S3, Elastic Beanstalk), Hadoop, Hive, Impala, HBase, Maven, Continuous Integration (Git, Stash, Jenkins), TDD (JUnit, Hamcrest, WireMock), BDD (JBehave), Tomcat, Python, Bash, Linux (CentOS 6), IntelliJ, Agile/SCRUM (JIRA)
Dun & Bradstreet is a leading global provider of business decisioning data and analytics. The company empowers organizations to make informed decisions by delivering comprehensive financial, credit, and risk management information.
● Designed and implemented a test automation framework based on BDD (Behavior-Driven Development) using JBehave, Spring, and Java.
● Developed back-end features using Java and Jersey.
Senior Back-end Developer & Data Engineer
08.2013 - 12.2014 | AOL
OOP, Java, J2EE, Web Services (RESTful, Jersey), Web Technologies (Spring, Hibernate, JavaScript, HTML, CSS, ExtJS, CoffeeScript), TDD (JUnit, Hamcrest, Mockito), BDD (JBehave, Jasmine), ActiveMQ, Python, Bash, Oracle, MySQL, Hadoop (Java MapReduce, Pig, and Streaming Jobs), Eclipse, Continuous Integration (Git, Stash, Jenkins), Linux (CentOS), Agile/SCRUM (JIRA, VersionOne)
AOL is a global leader in digital media and online advertising, recognized for its innovative technology and extensive audience reach, offering a wide range of products and services that cater to consumers and businesses alike.
● Developed the eDemo Project (Demographic Data Collection), collaborating across multiple AOL targeting modules. This project involved collecting demographic data (age, gender, profession, location) from cookies to optimize ad targeting across AOL and partner sites.
● Implemented data parsing using Avro, integrated with a NoSQL storage system (Aerospike), and utilized a REST interface for configuration stored in an Oracle database.
● The project followed Agile/SCRUM methodologies and employed a continuous integration environment (Jenkins, Maven, PyBot).
● Performed maintenance and defect fixing, dedicating six months to resolving production issues and achieving the best performance review across all targeting teams.
● Contributed to mining impression data (Hadoop Java, Pig UDFs, Streaming/Perl) for customer billing.

Educational background

Electrical Engineering, Computer Science
Until 1999
USP - Universidade de São Paulo

Languages

English: Upper Intermediate