Show HN: I Scraped 2,200 Software Engineering Jobs from Career Pages Using LLMs

grepjob.com

8 points by kylem866 4 days ago

Hi everyone,

I built GrepJob because I got frustrated with the user experience of LinkedIn/Indeed while looking for a new SWE job.

Specifically: 1) Not being able to trust the posted date of any job. The date shown on LinkedIn is often the date the job was reposted by the recruiter. 2) Being shown too many irrelevant jobs. For example, I get shown senior/staff-level roles when I search for "Software Engineer II".

GrepJob solves 1) by populating the date_posted field directly from each company's ATS, and 2) by extracting seniority, specialty (frontend, backend, etc.), and tech stack from each job with LLMs.
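To make 2) concrete, here is a rough sketch of what the extracted fields could look like once the LLM's JSON comes back. This is not GrepJob's actual code: the names `JobTags`, `Seniority`, `Specialty`, and `parse_job_tags`, and the exact label sets, are assumptions made up for illustration. The idea is just to reject any label outside a fixed vocabulary before it reaches the search filters.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import List


class Seniority(str, Enum):
    # Hypothetical label set; the real site may use different levels.
    JUNIOR = "junior"
    MID = "mid"
    SENIOR = "senior"
    STAFF = "staff"


class Specialty(str, Enum):
    # Hypothetical label set based on the examples in the post.
    FRONTEND = "frontend"
    BACKEND = "backend"
    FULLSTACK = "fullstack"
    OTHER = "other"


@dataclass
class JobTags:
    seniority: List[Seniority]
    specialty: Specialty
    tech_stack: List[str]


def parse_job_tags(raw_json: str) -> JobTags:
    """Parse the model's JSON reply and validate it against the enums.

    Raises ValueError if a label is outside the allowed vocabulary
    and KeyError if a required field is missing.
    """
    data = json.loads(raw_json)
    return JobTags(
        seniority=[Seniority(s) for s in data["seniority"]],
        specialty=Specialty(data["specialty"]),
        # Drop empty entries and stray whitespace from the tech list.
        tech_stack=[str(t).strip() for t in data["tech_stack"] if str(t).strip()],
    )
```

A reply such as `{"seniority": ["mid"], "specialty": "backend", "tech_stack": ["Go", "Postgres"]}` parses cleanly, while an unexpected label like `"principal"` raises a `ValueError` instead of silently ending up in the index.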

Please let me know if you have any feedback, thanks!

toomuchtodo 4 days ago

You should connect with the person building https://hiring.cafe. They are scraping something like 1.6M jobs using ChatGPT, so there might be some collaboration opportunity or knowledge transfer.

https://news.ycombinator.com/item?id=42806956

Worst case, proven pattern to emulate. Wishing you success!

  • kylem866 4 days ago

    Thanks! I've actually already sent a message to the hiring cafe creator but didn't hear back. Might be worth another shot.

wbakst 2 days ago

Cool stuff! I wish there were a fuzzy search / filter bar to make it easier to search for more specific things.

I'm also curious, what are you using to structure the outputs?

spicy_ranch 4 days ago

I really enjoy the simple, elegant design and look of this site. Well done!

I did notice that the mid-level jobs are returning mainly senior roles though.

  • kylem866 4 days ago

    Thanks! Yeah, I have noticed accuracy problems with the seniority too. I'm using 4o-mini + structured output to extract the seniority. Currently the seniority output is defined as an array to handle edge cases where a job could technically be either mid-level or senior. In practice, though, the LLM is overeager at assigning multiple seniorities. It frequently gives a mid-level seniority to jobs which literally have 'Senior' in the title. I'll work on it!
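
    One possible guard for that failure mode, sketched under assumptions: check the title for an explicit level keyword and, if one is present, let it override whatever array the model returned. The function name `reconcile_seniority`, the keyword patterns, and the level labels are all hypothetical, not something the site is confirmed to use.

```python
import re

# Hypothetical keyword rules, checked from most to least senior so that
# "Senior Staff Engineer" resolves to "staff" rather than "senior".
TITLE_PATTERNS = [
    ("staff", re.compile(r"\b(staff|principal)\b", re.IGNORECASE)),
    ("senior", re.compile(r"\b(senior|sr)\b", re.IGNORECASE)),
    ("junior", re.compile(r"\b(junior|jr|entry[- ]level|new grad)\b", re.IGNORECASE)),
]


def reconcile_seniority(title, llm_seniority):
    """Prefer an explicit level in the title over the model's guess.

    If the title names a level, return only that level. Otherwise
    fall back to the list the LLM produced, unchanged.
    """
    for level, pattern in TITLE_PATTERNS:
        if pattern.search(title):
            return [level]
    return list(llm_seniority)


print(reconcile_seniority("Senior Software Engineer", ["mid", "senior"]))
print(reconcile_seniority("Software Engineer II", ["mid"]))
```

    Titles without an explicit keyword, like "Software Engineer II", still rely entirely on the model, so this only covers the case where the title already states the level.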