35 million Google Profiles downloaded to a local database = 1 month

2011-5-25 23:58 | Posted by: dalaohu | Views: 1692 | Original link

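Network security problems have been in the spotlight lately: after the PlayStation Network, the Sony Ericsson online store has now been hacked as well. Quite a few articles are questioning the security of the cloud.
On top of that, this fellow spent one month downloading Google's user profiles into a local database, entirely legally. Such a large, uninterrupted stream of requests from a single IP address was, surprisingly, not restricted by Google's security systems or by any automatic protection measures.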

1 Database Containing 35.000.000 Google Profiles. Implications?

This is a follow-up to my previous blogpost on this topic.

In February 2011 it proved trivial to create a database containing ALL ~35.000.000 Google Profiles, without Google throttling, blocking, CAPTCHAing or otherwise hindering mass-downloading attempts. It took only 1 month to retrieve the data, convert it to SQL using SpiderMonkey and some custom JavaScript code, and import it into a database. The database contains Twitter conversations (also stored in the OZ_initData variable), person names, aliases/nicknames, multiple past educations (institute, study, start/end date), multiple past work experiences (employer, function, start/end date), links to Picasa photo albums, ... -- and in ~15.000.000 cases, also the username and therefore the @gmail.com address. In summary: 1 month + 1 connection = 1 database containing 35.000.000 Google Profiles.
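As a rough back-of-the-envelope check on what "1 month + 1 connection" implies, the sketch below computes the average sustained request rate. It assumes a 30-day month and exactly one HTTP request per profile. Neither figure is stated in the post, so the result is only an order-of-magnitude estimate.

```python
# Rough estimate of the average request rate implied by the figures above.
# Assumptions (not from the post): a 30-day month, one request per profile.
PROFILES = 35_000_000
DAYS = 30
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(profiles: int, days: int) -> float:
    """Return the mean number of requests per second over the period."""
    return profiles / (days * SECONDS_PER_DAY)


rate = average_rate(PROFILES, DAYS)
print(f"{rate:.1f} requests per second on average")
```

Under those assumptions the crawl averages roughly 13 to 14 requests per second from one address, sustained for a month.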

My activities are aimed at provoking debate about privacy -- NOT to create DISTRUST but to achieve REALISTIC trust -- and about the meaning of "informed consent", which, when signing up for online services like Google Profile, amounts to checking a box. How can users possibly be considered "informed" when they are not made aware 1) that it does not seem to bother Google that profiles can be mass-downloaded (Dutch) and 2) of the misuse value, or hopefully the lack of it, of their social data to criminals and certain types of marketeers? Does this enable mass spear phishing attacks and other types of social engineering, or is that risk negligible, e.g. because criminals use other methods of attack and/or have other, better sources of personal data? Absence of ANY protection against mass-downloading is the status quo at Google Profile. Strictly speaking I did not even violate Google policy in retrieving the profiles, because http://www.google.com/robots.txt explicitly ALLOWS indexing of Google Profiles and my code is part of a personal experimental search engine project. At the time of this writing, that robots.txt file contains:

Allow: /profiles
Allow: /s2/profiles
Allow: /s2/photos
Allow: /s2/static
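To illustrate how a crawler would interpret rules like these, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The four `Allow` lines are the ones quoted above. The `User-agent: *` header and the trailing `Disallow: /` line are assumptions added purely for contrast, since the rest of the real file is not reproduced in the post, and the two test paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The four Allow lines are quoted from the post. The "User-agent: *" header
# and the final "Disallow: /" line are ASSUMPTIONS added for illustration;
# the rest of the real robots.txt is not reproduced in the post.
ROBOTS_EXCERPT = """\
User-agent: *
Allow: /profiles
Allow: /s2/profiles
Allow: /s2/photos
Allow: /s2/static
Disallow: /
"""


def is_allowed(path: str, agent: str = "*") -> bool:
    """Check a path against the excerpt using the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_EXCERPT.splitlines())
    return parser.can_fetch(agent, "http://www.google.com" + path)


profile_ok = is_allowed("/profiles/example-user")  # hypothetical profile path
other_ok = is_allowed("/some/other/path")          # hypothetical non-profile path
print(profile_ok, other_ok)
```

Note that the standard-library parser applies the first matching rule in file order, which is why the assumed `Disallow: /` line is placed last.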

I'm curious whether there are any implications to the fact that it is completely trivial for a single individual to do this -- possibly there aren't. That's something worth knowing too. I'm also curious whether Google will take measures to protect against mass downloading of profile data, or whether this is a non-issue for them too. In my opinion the misuse value of personal data on social networks ought to be established before that data is published under a false perception of informed consent. One possible outcome
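Throttling is one of the protective measures the post says was absent. As a minimal sketch of what such a measure could look like in general, the block below implements a per-client token bucket. It is not a description of anything Google uses. The class name, the limits, and the example address (taken from a documentation-only IP range) are all hypothetical.

```python
import time
from collections import defaultdict


class PerClientThrottle:
    """Token bucket per client key (for example an IP address)."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate          # tokens refilled per second
        self.burst = burst        # maximum bucket size
        self.clock = clock        # injectable so the sketch can be tested
        self._tokens = defaultdict(lambda: float(burst))
        self._last = {}

    def allow(self, client: str) -> bool:
        """Return True if the request may proceed, False if it is throttled."""
        now = self.clock()
        elapsed = now - self._last.get(client, now)
        self._last[client] = now
        # Refill according to elapsed time, capped at the burst size.
        tokens = min(self.burst, self._tokens[client] + elapsed * self.rate)
        if tokens >= 1:
            self._tokens[client] = tokens - 1
            return True
        self._tokens[client] = tokens
        return False


# Hypothetical limits: 1 request/second sustained, bursts of up to 5.
fake_now = [0.0]  # fixed fake clock, so all 8 requests arrive "at once"
throttle = PerClientThrottle(rate=1.0, burst=5, clock=lambda: fake_now[0])
results = [throttle.allow("203.0.113.7") for _ in range(8)]
print(results)
```

With the fake clock held still, the first five requests pass and the remaining three are rejected. A client that keeps sending above the refill rate stays limited to that rate, while other clients are unaffected.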

My activities were performed as part of my research on anonymity/privacy at the University of Amsterdam. I'm writing a research paper about the above. Repeating from my previous post: this blog runs at Google Blogger. I sincerely hope my account "mrkoot" and blog.cyberwar.nl will not be blocked or banned - I did NOT publish the database and did NOT violate any Google policy.

Contact me by e-mail(*): kootNO_SPAM_PLEASE@uva.nl (remove "NO_SPAM_PLEASE")
Contact me on Twitter: http://twitter.com/mrkoot.

(*) I prefer insults to be sent to mrkoot@gmail.com, as Gmail has superior filters.
