Latest Updates
Stay up to date with the latest features, tutorials, and insights about social media scraping
How to scrape YouTube transcripts with Node.js in 2025
I’m going to show you how to scrape YouTube transcripts in Node.js, but the technique works in any programming language. If you are just looking for a pre-built API, check out the Scrape Creators YouTube Transcript API. Scrape Creators also has transcript APIs for TikTok, Instagram, Facebook, and Twitter.
OK, first go to the YouTube search page. In this case, I am going to search for “Charles Barkley Jussie Smollett”.
Next, we want to see if we can find any APIs that YouTube is using to fetch the video and, hopefully, the transcript. Open up the dev tools by right clicking > Inspect Element, then go to the “Network” tab. To make things easier, filter by “Fetch/XHR”. Now click on any video you want, and observe the requests.
Notice the route “next?prettyPrint=false”. Click on that route and check out the Response. If you start searching for the video title or views, you’ll see them in this response. So cool, looks like we found the endpoint that YouTube is using to fetch video details.
Now we want to actually call it in Node.js. Go to the next?prettyPrint=false endpoint and right click to Copy as fetch (Node.js). Make sure you have node-fetch installed with npm install node-fetch. We’re going to be using async/await, so wrap the copied fetch call in an async function and await the response. If you make the request, it should be successful. Woohoo! 🥳 Nice job.
But we don’t want to fetch the same video over and over again; we want to dynamically fetch different videos. If you look at the request payload, there is a “videoId” field. That’s pretty convenient for us, because it means we can just pass a different “videoId” and get that video’s details. You can find the videoId easily because it’s always in the query params of the video URL. For example, in https://www.youtube.com/watch?v=Y2Ah_DFr8cw the videoId is Y2Ah_DFr8cw. So just make sure to pass whatever videoId you want to get, and that should be it. You can get rid of that params key.
Cool, now how do we get the transcript?
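Before moving on to the transcript, here is a rough sketch of the video details call described above. It uses the `fetch` built into Node 18+ instead of node-fetch (the call looks the same either way). The full URL, the `context` object, and the `clientVersion` value are assumptions modeled on what a "Copy as fetch" output typically contains, so swap in the values from your own copied request. The function names are just illustrative.

```javascript
// Sketch only. Assumes Node 18+ (global fetch).
// ASSUMPTION: this URL and the "context" values are placeholders; replace them
// with the ones from your own "Copy as fetch (Node.js)" output.
const NEXT_URL = "https://www.youtube.com/youtubei/v1/next?prettyPrint=false";
const CLIENT_CONTEXT = {
  client: { clientName: "WEB", clientVersion: "2.20250101.00.00" }, // hypothetical version
};

// Pull the videoId out of a watch URL's "v" query param.
function extractVideoId(watchUrl) {
  return new URL(watchUrl).searchParams.get("v");
}

// Build the request payload; only the videoId changes between videos.
function buildVideoDetailsBody(videoId) {
  return { context: CLIENT_CONTEXT, videoId };
}

// POST the payload and return the parsed video details JSON.
async function fetchVideoDetails(videoId) {
  const response = await fetch(NEXT_URL, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(buildVideoDetailsBody(videoId)),
  });
  if (!response.ok) {
    throw new Error(`Video details request failed with status ${response.status}`);
  }
  return response.json();
}
```

You would call it with something like `fetchVideoDetails(extractVideoId("https://www.youtube.com/watch?v=Y2Ah_DFr8cw"))`. If YouTube rejects the request, compare your headers and payload against the copied request and add back whatever is missing.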
Well, let’s first see how YouTube is getting the transcript. Click the “...more” in the video description. Clear the network requests so we get an easier view of what happens, then click the Show transcript button.
Now they fire off the “get_transcript” endpoint, and it doesn’t take a genius to figure out that that’s how they are getting the transcript. Click on the response, and you can see that the transcript is there in nice JSON for us. Excellent.
Now let’s look at how they’re calling it. If you go to the Payload tab you’ll see an “externalVideoId”, which is just the videoId, and a “params” value, which we did not need to fetch the video details, but spoiler alert, we will need it here. So where does it come from? Well, if you still have the video details response available, search for “getTranscriptEndpoint”. If you copy the params value from the payload and search for it in the video details response, you’ll see that it is identical to the params value YouTube is using to get the transcript 🙌
So now, to actually call the endpoint to get the transcript, copy the get_transcript request like we did for the video details request. Then make sure to pass the externalVideoId and the getTranscriptEndpoint params value from the video details response, and that’s it!
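The last step can be sketched roughly like this. It takes a video details response you already fetched, digs out the `getTranscriptEndpoint` params wherever they are nested, and posts them to get_transcript. Same caveats as before: the full URL, the `context` object, and the `clientVersion` are assumptions to be replaced with the values from your own copied request, the function names are made up for illustration, and it assumes Node 18+ for the built-in `fetch`.

```javascript
// Sketch only. Assumes Node 18+ (global fetch).
// ASSUMPTION: URL and "context" are placeholders; use the values from your own
// copied get_transcript request.
const GET_TRANSCRIPT_URL =
  "https://www.youtube.com/youtubei/v1/get_transcript?prettyPrint=false";
const TRANSCRIPT_CONTEXT = {
  client: { clientName: "WEB", clientVersion: "2.20250101.00.00" }, // hypothetical version
};

// Walk the video details JSON (objects and arrays) until we hit a
// getTranscriptEndpoint with a params string. Returns null if none is found.
function findTranscriptParams(node) {
  if (node === null || typeof node !== "object") return null;
  const endpoint = node.getTranscriptEndpoint;
  if (endpoint && typeof endpoint.params === "string") return endpoint.params;
  for (const value of Object.values(node)) {
    const found = findTranscriptParams(value);
    if (found) return found;
  }
  return null;
}

// Call get_transcript with the videoId and the params from the details response.
async function fetchTranscript(videoId, videoDetails) {
  const params = findTranscriptParams(videoDetails);
  if (!params) throw new Error("No getTranscriptEndpoint found for this video");
  const response = await fetch(GET_TRANSCRIPT_URL, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({
      context: TRANSCRIPT_CONTEXT,
      externalVideoId: videoId,
      params,
    }),
  });
  if (!response.ok) {
    throw new Error(`get_transcript failed with status ${response.status}`);
  }
  return response.json();
}
```

Searching the whole response recursively is a deliberate shortcut: the exact path to `getTranscriptEndpoint` is deep and can shift, so hardcoding it is brittle.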
June 1, 2025
Instagram Scraping in 2025: The Workarounds That Still Work
First, for all the lawyers out there reading this, we're only getting *public* data, so just relax. Aight, let's get into it.
If you try to scrape via the browser, you're already cooked. Tools like Puppeteer or Selenium are easy to detect.
If you want to find the API that gets someone's public profile (their follower count, bio, links, etc.), do this: Visit my IG page (adrianhorning). Next, we need to monitor the network requests as they're happening, so open the dev tools with right click > Inspect > Network (filter by Fetch/XHR). Scroll to the bottom, to the "Related Accounts", and click on one. Observe the requests. Notice one that is named "web_profile_info"? That is probably the one we need 😅
Next, the HTTP client you use is super important. Use got-scraping from Apify. It is incredible at getting around stuff. And lastly, you'll need residential proxies. There are a lot of providers out there, and I have my personal favorites.
Then just pretty much do that for all the public information you want. Sometimes you need to include the headers that they are sending, so pay attention to that. And to get additional pages of results, they use cursor-based pagination, so just look for "cursor" (most of the time) in the payload.
If you want a solution that's already done for you, you can just call my IG profile endpoint, docs here: https://docs.scrapecreators.com/v1/instagram/profile
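If you want to try the DIY route, here is a minimal sketch of calling "web_profile_info" with got-scraping through a proxy. The full endpoint URL and the `username` query param are assumptions based on what shows up in the Network tab, so check them against your own request. The function names and the proxy URL format are placeholders, and any required headers should be copied from the request you observed.

```javascript
// Sketch only. Requires `npm install got-scraping`.
// ASSUMPTION: endpoint path and "username" param; confirm both in the Network tab.
const PROFILE_INFO_URL = "https://www.instagram.com/api/v1/users/web_profile_info/";

// Build the got-scraping options. `headers` should be whatever the real
// request sent; `proxyUrl` is your residential proxy, if you have one.
function buildProfileRequest(username, { proxyUrl, headers = {} } = {}) {
  const url = new URL(PROFILE_INFO_URL);
  url.searchParams.set("username", username);
  const options = { url: url.toString(), responseType: "json", headers };
  if (proxyUrl) options.proxyUrl = proxyUrl;
  return options;
}

// got-scraping is published as an ES module, so it is loaded with a dynamic
// import here to keep the rest of the file plain CommonJS.
async function fetchPublicProfile(username, requestOptions) {
  const { gotScraping } = await import("got-scraping");
  const response = await gotScraping(buildProfileRequest(username, requestOptions));
  return response.body;
}
```

A call would look something like `fetchPublicProfile("adrianhorning", { proxyUrl: "http://user:pass@proxy.example.com:8000", headers: { /* copied from dev tools */ } })`.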
May 11, 2025
Instagram API Without Rate Limits: How to Easily Access Public Data
The Instagram Graph API is awesome, but it has a rate limit of 200 calls per hour multiplied by the number of users. So if you're doing any kind of volume, this really sucks and is a huge bottleneck for your project.
To get around this, you could try scraping Instagram yourself: using proxies, running headless browsers, constantly dodging bans, etc. But unless you want to babysit all that headache 24/7, there’s a much easier way. Scrape Creators gives you direct access to public Instagram data, without rate limits, and it's really easy to use.
Example: Get Public Profile Data
Let’s say you want to pull public info about an Instagram user like nike. You want some basic information like bio, website, follower count, recent posts, etc. (The documentation for this API is on the Instagram Profile Endpoint page.) All you need to do is make a GET request to /v1/instagram/profile. To get an API key, just sign up at app.scrapecreators.com (free trial, no credit card required). The response has that basic information, and even the user's recent posts.
If you want to get just the posts of the user, use the /v2/instagram/user/posts endpoint. And that's it!
How to Get Started
Sign up for a Scrape Creators account, get your API key, and start making requests. You can check out the full API docs here. If you want reliable, no-drama access to public Instagram data, Scrape Creators makes it stupid simple. Sign up here and get started.
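As a rough illustration, the profile request could look like the sketch below. Only the `/v1/instagram/profile` path comes from this post. The `api.scrapecreators.com` host, the `handle` query param, and the `x-api-key` header are assumptions here, so confirm them in the endpoint docs before relying on them. It assumes Node 18+ for the built-in `fetch`, and the function names are illustrative.

```javascript
// Sketch only. Assumes Node 18+ (global fetch).
// ASSUMPTION: host, "handle" param, and "x-api-key" header; verify in the docs.
const SCRAPE_CREATORS_BASE = "https://api.scrapecreators.com";

// Build the URL and headers for a public profile lookup.
function buildProfileRequest(handle, apiKey) {
  const url = new URL("/v1/instagram/profile", SCRAPE_CREATORS_BASE);
  url.searchParams.set("handle", handle);
  return { url: url.toString(), headers: { "x-api-key": apiKey } };
}

// Make the GET request and return the parsed JSON.
async function getInstagramProfile(handle, apiKey) {
  const { url, headers } = buildProfileRequest(handle, apiKey);
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error(`Profile request failed with status ${response.status}`);
  }
  return response.json();
}
```

For example, `getInstagramProfile("nike", process.env.SCRAPE_CREATORS_API_KEY)`, where the environment variable name is just a suggestion for keeping the key out of your code.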
April 28, 2025
How to scrape the Meta Ad Library
The Meta Ad Library is a goldmine for marketers, researchers, and growth teams, but it wasn’t built with automation in mind. That’s why I built the Scrape Creators API to pull real-time Meta Ad Library data using simple HTTP requests.
In this post, I’ll show you exactly how to do that using **JavaScript with Axios**. But you can also use any no-code platform like Clay, Make, or Zapier: anything you can make API calls from.
You’ll also find:
- A video walkthrough on YouTube
- A working code example on GitHub
- Live demos for searching ads, finding companies, and more
🎥 Watch the Video
👉 Click here to watch on YouTube
🧠 What You'll Learn
- How to search for ads by keyword
- How to look up a company page
- How to get all ads from a specific advertiser
- How to retrieve a single ad by ID
- Copy/paste-ready Axios code examples
🛠 Setup
Before you start:
- Sign up at [Scrape Creators](https://scrapecreators.com) and grab your API key
- If you are using JavaScript (Node.js), install Axios in your project with `npm install axios`
🔍 Search Ads by Keyword
You know how you can search for ads by a keyword in the Meta Ad Library? Well, you can do the same thing programmatically! All you have to do is call the Search Ads endpoint with a `query` parameter. And you can even search by Ad Type, Country, Status, and Start and End Date!
Here are some of the valuable fields that will be returned in the response:
- `ad_archive_id` -> this is the ad id
- `start_date_string`
- `end_date_string`
- `publisher_platform` -> to know if the ad is running on Instagram, Facebook, etc.
- `snapshot` -> this is where all the ad copy and variations will be
- `body` -> body of the ad
- `cta_text`
- `link_url` -> where the user goes after they click on the ad
- `page_id` -> the company's ad library page id
- `videos`
- `images`
And if you want more pages of results, just pass the `cursor` that is returned.
Get Individual Ad Details
Let's say we want to get the details of a single ad. For that we just need the Library ID, in this case 1755312555404167. You will get the same details as above, including start_date_string, end_date_string, publisher_platform, snapshot, cta_text, link, etc.
Search for a company's ads
Now let's say we wanted to just get the ads from lululemon. First we need the lululemon page id. If you already know it, great, hold onto it. If you don't, you can find it with the company search endpoint. That will return a `searchResults` array. You want to key into the first search result and grab its `page_id`. Then you can use that page id to fetch their ads from the company ads endpoint. The response will be almost identical to the Search endpoint's. If you want additional pages, a `cursor` will be returned, and you need to pass that to subsequent requests.
And that's it!
📦 Get the Full Code
Want to see all this code in one file? 👉 View on GitHub
✅ Ready to Start?
Create your free API key at Scrape Creators and start pulling Meta Ad Library data into your app.
Questions or feedback? Hit me up at adrian@thewebscrapingguy.com
Other APIs
If you're looking to scrape other Ad Libraries like the LinkedIn Ad Library or Google Ad Transparency Center, check out Scrape Creators. And if you want to scrape Instagram, YouTube, TikTok, Twitter, and more, you can use Scrape Creators for that too!
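To tie the company lookup and pagination together, here is a rough Axios sketch. The `query` and `cursor` parameters and the `searchResults` and `page_id` fields come from this post. The host, the route paths, the `pageId` parameter name, and the `x-api-key` header are assumptions, so check them against the docs or the GitHub example. The function names are illustrative only.

```javascript
// Sketch only. Requires `npm install axios`.
// ASSUMPTION: host, route paths, "pageId" param, and "x-api-key" header are
// placeholders; confirm them in the Scrape Creators docs.
const AD_LIBRARY_BASE = "https://api.scrapecreators.com";
const AD_LIBRARY_ROUTES = {
  searchAds: "/v1/facebook/adLibrary/search/ads",
  searchCompanies: "/v1/facebook/adLibrary/search/companies",
  companyAds: "/v1/facebook/adLibrary/company/ads",
};

// One GET request; Axios is loaded lazily so the helpers below work without it.
async function callAdLibrary(path, params, apiKey) {
  const axios = require("axios");
  const response = await axios.get(AD_LIBRARY_BASE + path, {
    params,
    headers: { "x-api-key": apiKey },
  });
  return response.data;
}

// Key into the first company search result and return its page_id.
function firstPageId(companySearch) {
  const results = companySearch.searchResults || [];
  if (results.length === 0) throw new Error("No matching company found");
  return results[0].page_id;
}

// Generic cursor loop: keep requesting pages until no cursor comes back
// or maxPages is reached. Returns the raw page responses.
async function collectPages(fetchPage, maxPages = 5) {
  const pages = [];
  let cursor;
  for (let i = 0; i < maxPages; i++) {
    const page = await fetchPage(cursor);
    pages.push(page);
    cursor = page.cursor;
    if (!cursor) break;
  }
  return pages;
}

// Look up a company by name, then page through its ads.
async function getCompanyAds(companyName, apiKey, maxPages = 5) {
  const companies = await callAdLibrary(
    AD_LIBRARY_ROUTES.searchCompanies,
    { query: companyName },
    apiKey
  );
  const pageId = firstPageId(companies);
  return collectPages(
    (cursor) =>
      callAdLibrary(
        AD_LIBRARY_ROUTES.companyAds,
        cursor ? { pageId, cursor } : { pageId },
        apiKey
      ),
    maxPages
  );
}
```

The `maxPages` cap is there so a big advertiser does not burn through your credits by accident. The same `collectPages` helper works for keyword search by swapping in the search route with a `query` param.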
April 20, 2025