Thursday, August 21, 2014

Defensive Shift - Turning the Tables on Surveillance

Like many people lately, I've been pondering the implications of pervasive surveillance, "big data" analysis, state-sponsored security exploits, and the role of technology in government. For one thing, my work involves a lot of the same technology: deep packet inspection, data analysis, machine learning and even writing experimental malware. However, instead of building tools that enable pervasive government surveillance, I've built a product that tells smartphone users if their device, or a laptop connected to it, has been infected with malware, been commandeered into a botnet, or come under attack from a malicious website, and so on.  I'm happy to be applying some of this technology in a way that actually benefits regular people. It feels much more on the "good side" of technology than on the bad side we've been hearing so much about lately.

Surveillance of course has been in the news a lot lately, so we're all familiar with the massive betrayal of democratic principles by governments, under the guise of hunting the bogeyman. It's good that people are having conversations about reforming it, but don't expect the Titanic to turn around suddenly. There's far too much money and too many careers on the line to just shut down the leviathan of pervasive surveillance overnight. It will take time, and a new generation of more secure networking technologies.

Big data has also been in the news in some interesting ways: big data analysis has been changing the way baseball is played! CBC's David Common presents the story [1].

Not everyone is happy with the "defensive shift" - the practice of repositioning fielders based on batting stats that tell coaches how likely a batter is to hit left or right, short or long.  Longtime fans feel it takes away from the human element of the game and turns it into more of a science experiment.

I tend to agree.  And to be honest, until now deep traffic inspection, big data analysis, surveillance, and definitely state-sponsored hacking, have quite justifiably earned a reputation as, well, repugnant to any freedom-loving, democracy-living, brain-having person. Nevertheless, as powerful as big data analytics, machine learning, and network traffic analysis are, and as much as they have been woefully abused by our own governments, I don't think we've yet begun to see the potential for good that these technologies could have, particularly if they are applied in reverse to the way they're being used now.

Right now we're in a position where a few privileged, state-sponsored bad actors are abusing their position of trust and authority to turn the lens of surveillance and data analysis upon ordinary people, foreign business competitors[2], jilted lovers [3], etc.  The sea change that will, I think, eventually come is when the lens of technology slowly turns with relentless inevitability onto the government itself, and we have the people observing and monitoring and analyzing the effectiveness of our elected officials and public servants and their organizations.

How do we begin to turn the tables on surveillance?

Secure Protocols

As I see it, this "defensive shift" will happen due to several factors. First, the best and brightest engineers - the ones who design the inner workings of the Internet and write the open-source software used for secure computing - are on the whole smart enough to recognize that pervasive surveillance is an attack and a design flaw [4]. They are calling for it to be fixed in future versions of Internet protocols [5], and are already working on fixing some of the known exploits [6].

One of the simplest remedial actions available right now for pervasive surveillance attacks is HTTPS, with initiatives like HTTPS Now [9] showing which web sites follow good security practices, and tools like HTTPS Everywhere [10], a plugin for your web browser that helps you connect to websites securely. There is still work to be done in this area, as man-in-the-middle attacks and compromised cryptographic keys are widespread at this point - a problem for which perfect forward secrecy [11] needs to become ubiquitous. We should expect future generations of networking protocols to be based on these security best practices.

Some people say that creating a system that is totally secure against all kinds of surveillance, including lawful intercept, will only give bad people more opportunity to plan and carry out their dirty deeds.  But this turns out not to be true when you look at the actual data: how much information has been collected, what it all costs, and how effective it has actually been.  It yields practically nothing useful and is almost always a "close the barn door, the horse is out!" scenario. This, coming from an engineer who actually works in the area of network-based threat analysis, by the way.

Open Data

Second, the open data movement. It's not just you and me producing data trails as we mobe and surf and twit around the Interwebs.  There's a lot of data locked up in government systems, too.  If you live in a democracy, who owns that data? We do. It's ours. More and more of it is being made available online, in formats that can be used for computerized data analysis.  Sites like the Center for Responsive Politics' Open Secrets database [8], for example, shed light on money in politics, showing who's lobbying for what, how much money they're giving, and who's accepting the bribes, er, donations.

One nascent experiment in the area of government open data analysis is AnalyzeThe.US [7], a site that lets you play with a variety of public data sources to look for correlations. Warning - it's possible for anyone to "prove" just about anything with enough graphs and hand-waving. For really meaningful analysis, having some background in mathematics and statistics is a definite plus, but the tool is still super fun and provides a glimpse of where things could be going in the future with open government.

Automation

Third, automation. There's still a long way to go in this area, but even the slowness and inefficiency of government will eventually give way to the relentless march of technology as more and more systems that have traditionally been mired in bureaucratic red tape become networked and automated, all producing data for analytics. Filling in paper forms for hours on end will eventually be as absurd for the government to require as it would be for buying a book from Amazon.

With further automation and data access, the ability to monitor, analyze and even take remedial action on bureaucratic inefficiencies should be in the hands of ordinary people, turning the current model of Big Brother surveillance on its head. Algorithms will be able to measure the effectiveness of our public services and national infrastructures, do statistical analysis, provide deep insight and make recommendations. The business of running a government, which today seems to be a mix of guesswork, political ideology and public relations management, will start to become less of a religion and more of a science, backed up with real data. It won't be a technocracy - but it will be leveraging technology to effectively crowd-source government.  Which is what democracy is all about, after all.


[1] http://www.cbc.ca/news/world/how-the-defensive-shift-and-big-data-are-changing-baseball-1.2739619
[2] http://www.cbc.ca/news/politics/why-would-canada-spy-on-brazil-mining-and-energy-officials-1.1931465
[3] http://www.cnn.com/2013/09/27/politics/nsa-snooping/
[4] http://tools.ietf.org/html/rfc7258
[5] http://techcrunch.com/2013/10/11/icann-w3c-call-for-end-of-us-internet-ascendancy-following-nsa-revelations/
[6] https://www.fsf.org/blogs/community/gnu-hackers-discover-hacienda-government-surveillance-and-give-us-a-way-to-fight-back
[7] AnalyzeThe.US
[8] https://www.opensecrets.org/
[9] https://www.httpsnow.org/
[10] https://www.eff.org/https-everywhere
[11] http://en.wikipedia.org/wiki/Forward_secrecy#Perfect_forward_secrecy

Thursday, August 14, 2014

Repackaging node modules for local install with npm


If you need to install an npm package for Node.js from local files - because you can't (or would rather not) download everything from the npmjs.org registry, or you don't even have a network connection - then you can't just grab an npm package tarball and run `npm install <tarball>`, because npm will immediately try to download all of its dependencies from the registry.

There are some existing tools and resources you can try:

  • npmbox - https://github.com/arei/npmbox
  • https://github.com/mikefrey/node-pac
  • bundle.js gist -  https://gist.github.com/jackgill/7687308
  • relevant npm issue - https://github.com/npm/npm/issues/4210

I found all of these a bit overwrought for my taste. So if you prefer a simple DIY approach, you can simply edit the module's package.json file, copy all of its dependencies into a "bundledDependencies" array, and then run `npm pack` to build a new tarball that includes all the dependencies bundled inside.

Using `forever` as an example:
  1. make a directory and run `npm init; npm install forever` inside of it
  2. cd into the node_modules/forever directory
  3. edit the package.json file
  4. look for the dependencies property
  5. add a bundledDependencies property that's an array
  6. copy the names of all the dependency modules into the bundledDependencies array
  7. save the package.json file
  8. now run `npm pack`. It will produce a forever-<version>.tgz file that has all of its dependencies bundled in.
Update: another proposal from the GitHub thread (I haven't verified this yet):
  1. In the online environment, run `npm install --no-bin-link`. You will get an entirely flattened node_modules directory.
  2. Then bundle this flattened node_modules with tar / zip / rar / 7z, etc.
  3. In the offline environment, extract the bundle - that's it.


Productivity and Note-taking

I told a friend of mine that I wasn't really happy with the amount of time that gets taken up by Slack and "communication and sched...