Senior Lead Data Engineer - Remote AI/ML Social Media Platform
 Remote
 Full-time 
Are you a seasoned Data Engineer eager to shape how artificial intelligence is applied to social media content creation? Our platform uses advanced AI to change how content is created and published across major social networks. We're seeking a talented Lead Data Engineer to architect and maintain the data infrastructure powering our AI systems.
Key Responsibilities
- Design, develop, and maintain robust, fault-tolerant data pipelines for collecting, processing, and storing data from multiple social media platforms and complex user interactions.
- Architect comprehensive data warehouse solutions on modern cloud technologies (Amazon Redshift, Azure Synapse Analytics) that scale effectively with our growing user base and expanding feature set.
- Implement rigorous data quality checks and validation processes to ensure the integrity, accuracy, and reliability of social media data consumed by our machine learning models.
- Develop and automate Extract, Transform, Load (ETL) processes using industry-standard tools like Apache Airflow and dbt to streamline data ingestion and transformation, significantly reducing manual intervention.
- Monitor and continuously optimize data pipelines to improve throughput, reliability, and scalability, ensuring 99.9%+ uptime for our AI-powered content assistant.
- Collaborate closely with Data Scientists, ML Engineers, and cross-functional teams to translate business requirements into efficient data infrastructure solutions.
- Enforce stringent data governance practices that ensure data privacy, security, and compliance with relevant regulations, including GDPR, CCPA, and industry-specific requirements.
- Establish quantifiable performance benchmarks and implement comprehensive monitoring solutions to proactively identify and address bottlenecks or anomalies.
- Partner with data analysis teams to design interactive, real-time dashboards that enable data-driven decision-making across the organization.
- Develop and support specialized data marts that provide actionable insights into social media trends, user engagement patterns, and content performance metrics.
- Evaluate emerging data technologies, tools, and frameworks to continually enhance our data engineering capabilities and maintain competitive advantage.
Required Skills & Qualifications
- Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or related technical field.
- 5+ years of proven experience in data engineering, with demonstrable success in ETL process design, data pipeline development, and data quality assurance.
- Strong proficiency in Python 3.9+ and SQL (PostgreSQL, MySQL, or similar), with experience using modern data engineering libraries and tools such as pandas, NumPy, and Apache Airflow.
- Extensive hands-on experience with cloud-based data storage and processing solutions, particularly AWS (Redshift, S3, Glue, Lambda) and/or Azure (Data Factory, Synapse Analytics, Blob Storage).
- Practical experience implementing and maintaining data pipelines using technologies such as Apache Spark 3.x, Kafka, or similar streaming/batch processing frameworks.
- Experience designing and optimizing data warehouses and data lakes for performance, cost-efficiency, and scalability.
- Demonstrated ability to implement data governance frameworks and ensure compliance with privacy regulations in multinational contexts.
- Proficiency with version control systems (Git), CI/CD pipelines, and infrastructure-as-code practices.
Why Join Our Team
Become an integral part of a forward-thinking team building next-generation AI technology for social media content creation. We offer competitive compensation, flexible remote work arrangements, opportunities for professional development, and the chance to solve complex data challenges at scale. Your contributions will directly impact millions of users and shape the future of AI-assisted content creation across global social platforms.
