PLACED is a young SaaS company based in Berlin-Mitte. With our AI-based platform, we are revolutionizing the world of recruitment agencies by helping them find the perfect jobs for their candidates faster and take their sales to a new level. Our goal is clearly defined: we want to become the next unicorn in the SaaS sector!

We're looking for a sharp and detail-oriented Working Student to support our data engineering team with large-scale web scraping and structured data extraction tasks. You will play a critical role in building a scalable pipeline that fetches, cleans, and structures job data from various job boards, ATS portals, and career websites. Learn and grow with us!
Tasks
- Go through job boards, career pages, and portals to collect job listing links
- Fetch the web page content (HTML) from each link using tools like requests or curl, or services like ScrapeOps
- Use Scrapy to build and manage web crawlers that extract useful data from those pages
- Clean and organize the data, and convert it into a consistent JSON format
- Deal with messy web pages that have inconsistent structures or complex elements
- Upload the final cleaned data to an Amazon S3 bucket (cloud storage) in the required format
- Add error handling and logging so your code runs smoothly even when there are issues
- Work closely with the product and data teams to make sure the data is accurate and useful
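To give a feel for the work, here is a minimal sketch of the extract → clean → upload flow using only the Python standard library for parsing. The HTML structure, field names, and the `upload_record` helper are illustrative assumptions, not our actual schema; a real crawler would use Scrapy as described above.

```python
import json
import logging
from html.parser import HTMLParser

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("job-pipeline")

class JobListingParser(HTMLParser):
    """Collects text from <h1> (title) and any tag with class='location'.
    (Hypothetical page structure, for illustration only.)"""
    def __init__(self):
        super().__init__()
        self._field = None
        self.data = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h1":
            self._field = "title"
        elif attrs.get("class") == "location":
            self._field = "location"

    def handle_data(self, data):
        if self._field and data.strip():
            self.data[self._field] = data.strip()
            self._field = None

def parse_job_listing(html: str) -> dict:
    """Extract a structured record from raw listing HTML."""
    parser = JobListingParser()
    try:
        parser.feed(html)
    except Exception:
        log.exception("failed to parse listing")  # keep the pipeline running
    # Normalize into a fixed schema; missing fields default to None.
    return {"title": parser.data.get("title"),
            "location": parser.data.get("location")}

def upload_record(record: dict, bucket: str, key: str) -> None:
    """Upload one cleaned record to S3 as JSON (needs boto3 + AWS credentials)."""
    import boto3  # imported lazily so parsing has no AWS dependency
    boto3.client("s3").put_object(
        Bucket=bucket, Key=key,
        Body=json.dumps(record).encode("utf-8"),
        ContentType="application/json")

sample = "<h1>Data Engineer</h1><span class='location'>Berlin</span>"
print(parse_job_listing(sample))  # → {'title': 'Data Engineer', 'location': 'Berlin'}
```

In the real pipeline the parsing step is replaced by Scrapy spiders, but the shape of the job stays the same: fetch, extract, normalize to JSON, push to S3.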
Requirements
- You're currently studying Computer Science, Data Science, or a similar field at a German university
- You're comfortable writing scripts in Python and enjoy automating things
- You've worked with tools like Scrapy, requests, or BeautifulSoup to scrape websites
- You know how to read and work with HTML and JSON data
- You've heard of APIs and maybe even used one before
- You understand basic cloud tools like AWS S3 (or are eager to learn)
- You know the basics of good web scraping etiquette (like being polite to websites by waiting between requests, etc.)
- You've used Git to manage your code (e.g. GitHub projects or university work)
- You pay attention to detail, like solving problems, and can work on tasks independently
- Bonus if you've used tools like Playwright or Selenium, or have dealt with websites that are hard to scrape (like ones that use JavaScript or CAPTCHAs)
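As a hint of what "being polite to websites" means in practice, here is a sketch of Scrapy project settings that throttle a crawler. The setting names are standard Scrapy settings; the specific values and the contact address in the user agent are placeholder assumptions to tune per site.

```python
# settings.py — polite-crawling configuration for a Scrapy project (sketch)

ROBOTSTXT_OBEY = True               # respect each site's robots.txt rules
DOWNLOAD_DELAY = 2.0                # wait ~2s between requests to the same site
CONCURRENT_REQUESTS_PER_DOMAIN = 1  # never hammer a single domain in parallel
AUTOTHROTTLE_ENABLED = True         # back off automatically when a server slows down
AUTOTHROTTLE_START_DELAY = 1.0
RETRY_TIMES = 2                     # retry transient failures a couple of times
# Identify the bot honestly; the contact address here is a placeholder.
USER_AGENT = "jobs-crawler (contact@example.com)"
```

Scrapy picks these up automatically from a project's `settings.py`; knowing roughly what they do is the level of familiarity we mean.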
Benefits
- Flexible working hours tailored to your university schedule
- Learn hands-on from a team building real-world automation and data pipelines
- Gain experience with web-scale data collection and cloud deployment
- Opportunity to grow into a full-time role upon graduation
Interested? Send us your CV, GitHub/portfolio, and a short note about a scraping challenge you've solved.