After having watched a ton of Node.js tutorials (and TAing for a JS class), I decided a while ago: “For my next script, I’m totally going to use Node.”
I finally got the opportunity this past week. I was tasked with a menial job, and writing a script to do it brightened my day.
The first script dealt with an XML API feed. I immediately found xml2js, a nice converter, and set about looping through some API URLs, collecting the data I needed and totaling it up. It was a mess, and looked like this:
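[sourcecode language="javascript"]
var https = require("https");
var parseString = require("xml2js").parseString;

// running totals, shared by every response callback
var totalEntries = 0;
var something = 0;

https.get("https://someplace/someapi", function(response){
    var body = "";
    // the response streams in as chunks, so collect them
    response.on("data", function(chunk) {
        body += chunk;
    });
    response.on("end", function(){
        //console.log(body);
        parseString(body, function (err, result) {
            if (err) {
                console.error(err);
                return;
            }
            totalEntries += result.feed.entry.length;
            for(var i = 0; i < result.feed.entry.length; i++){
                // xml2js puts attributes under $; they come back as strings
                something += parseInt(result.feed.entry[i]['something'][0]['somethingelse'][0].$.thingiwant, 10);
            }
            console.log("Total stuff: " + something);
        });
    });
});
[/sourcecode]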
It was easy to get what I needed this way, but it's clearly not the right way to do it. The callbacks fire asynchronously, blah blah blah; that's not what I'm writing about.
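For what it's worth, here's a rough sketch of one way to handle the async part: count the outstanding requests and only report once the last one has called back. The names `totalAcross`, `fetchCount` and `fakeFetchCount` are made up for illustration. `fetchCount` stands in for the https.get + parseString dance above, and the stub fakes it with a timer so the sketch runs without a network.

```javascript
// Hypothetical helper: calls fetchCount(url, cb) for every URL and invokes
// done(err, total) exactly once, after all callbacks have reported back.
function totalAcross(urls, fetchCount, done) {
    var pending = urls.length;
    var total = 0;
    var failed = false;

    if (pending === 0) {
        return done(null, 0);
    }

    urls.forEach(function (url) {
        fetchCount(url, function (err, count) {
            if (failed) { return; }   // an earlier request already reported an error
            if (err) {
                failed = true;
                return done(err);
            }
            total += count;
            pending -= 1;
            if (pending === 0) {      // last one in reports the total
                done(null, total);
            }
        });
    });
}

// Stub standing in for a real request: reports 2 entries per URL, asynchronously.
function fakeFetchCount(url, callback) {
    setTimeout(function () { callback(null, 2); }, 10);
}

totalAcross(["url-one", "url-two", "url-three"], fakeFetchCount, function (err, total) {
    if (err) { return console.error(err); }
    console.log("Total stuff: " + total); // prints "Total stuff: 6"
});
```

The point is just that the total gets printed once, at the end, instead of once per response.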
The next one was very similar, but I had to scrape a web page, not just XML data. So I found a nice lib called jsdom, which builds a DOM for me to use jQuery on.
[sourcecode language="javascript"]
var jsdom = require("jsdom");

// url and host are placeholders for the site being scraped
jsdom.env(url, function(errors, window){
    var $ = require("jquery")(window);
    var total = 0;
    $(".some_class").each(function(key, value){
        // just use a regex to get it
        // it's buried in the onclick, so I'll have to use a regex regardless...
        var link = value.innerHTML.match(/newWindow\('([^']*)'/);
        if(link === null){
            return; // nothing to follow in this element, move on to the next
        }
        // link[1] is the first grouping
        jsdom.env(host + link[1], function(errors, window){
            var $ = require("jquery")(window);
            // use regex to get the xxxxxxx because I'm lazy
            var result = $('head').html().match(/someRegex/g);
            if(result !== null){
                for(var i = 0; i < result.length; i++){
                    var thing = result[i].match(/"([^"]*)"/)[1]; // get first grouping
                    // thing is a string (assumed numeric here); a bare += would concatenate
                    total += parseInt(thing, 10);
                }
            }
        });
    });
});
[/sourcecode]
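One thing worth spelling out about the regex trick: `match()` returns null when nothing matches, so indexing `[1]` straight off it throws on any element that doesn't have the link. A minimal sketch of a guard, where `firstGroup` is a name I made up and the sample markup is invented to fit the regex:

```javascript
// Hypothetical helper: returns the first capture group, or null if nothing matched.
// The pattern must not use the g flag, or match() returns whole matches instead of groups.
function firstGroup(text, pattern) {
    var m = text.match(pattern);
    return m === null ? null : m[1];
}

// Made-up markup in the shape the regex expects.
var sample = "<a onclick=\"newWindow('/detail?id=42')\">more</a>";

console.log(firstGroup(sample, /newWindow\('([^']*)'/));                // prints /detail?id=42
console.log(firstGroup("<a>nothing here</a>", /newWindow\('([^']*)'/)); // prints null
```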
Using jQuery here was super easy and super powerful: it's something I'm already very familiar with, and the task is well suited to it. The scripts themselves took minutes to write, not counting the time I spent finding where to get what I needed.
Yes. Phantom would have been a better choice.