What we want:
We are seeking a motivated Data Engineering Intern to join our team of analytics experts. The intern will assist in expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. This role is ideal for someone who is eager to learn about data systems and enjoys working with big data technologies.
Who we are:
Vertoz (NSEI: VERTOZ), an AI-powered MadTech and CloudTech platform offering Digital Advertising, Marketing and Monetization (MadTech) & Digital Identity and Cloud Infrastructure (CloudTech), caters to businesses, digital marketers, advertising agencies, digital publishers, cloud providers, and technology companies. For more details, please visit our website.
What you will do:
Development: Assist in creating and maintaining scalable big data applications using Python, Spark, Hive, and Impala.
Data Pipelines: Help develop and optimize data processing pipelines to handle large datasets.
Integration: Support the implementation of data ingestion, transformation, and loading processes.
Collaboration: Work with data scientists and analysts to meet data requirements.
Quality Control: Ensure data quality, integrity, and security.
Performance: Monitor and troubleshoot performance issues to improve efficiency.
Documentation: Participate in code reviews, testing, and documentation.
Learning: Stay updated with industry trends and advancements in big data technologies.
Visualization: Assist in creating data visualizations and reports using Power BI.
Requirements
Basic proficiency in Python.
Familiarity with Apache Spark.
Exposure to Hive and Impala.
Understanding of Hadoop, HDFS, Kafka, and other big data tools.
Knowledge of data modeling, ETL processes, and data warehousing concepts.
Basic knowledge of Power BI for creating visualizations and reports.
Soft Skills: Excellent problem-solving, communication, and teamwork skills.
Benefits
No dress codes
Flexible working hours
5 days working
24 days of annual leave
International Presence
Celebrations
Team outings
Education
Graduate or Post Graduate