Tools Used for Collecting Online Data


An Educational Service of the American Library Association

Office for Information Technology Policy


Prepared by Leslie Harris & Associates in conjunction with OITP staff



Many web sites collect information about their visitors. The methods range from simply counting how many hits a web page receives in a given day or month to more sophisticated techniques that examine the user's Internet connection, computer, and software, or that rely on cookies and stealth data recording software.  Because web sites can collect personally identifiable information from patrons who access the Internet in libraries, it is useful for librarians to understand the variety of technologies that web site administrators use to collect information from their visitors.


As an initial matter, web site administrators can collect certain basic information about the users who visit their web sites.  Administrators may record the IP address of each computer that accesses their web sites. (An IP address is a set of four numbers, each between zero and 255, separated by periods, that identifies a computer or other hardware device on the Internet.)  Additionally, web site administrators may discern the path a user takes through a web site, including the site from which the user arrived and the site to which the user departs.
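
To make this concrete, the following is a minimal Python sketch of how an administrator might read a server log to count hits and to see where each visitor came from. It assumes the log uses the common "combined" access-log layout; the sample lines, addresses, and function names are invented for illustration and do not describe any particular site's setup.

```python
import ipaddress
import re
from collections import Counter

# Hypothetical log lines in the "combined" access-log layout (an assumption).
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2002:13:55:36 -0700] "GET /index.html HTTP/1.0" '
    '200 2326 "http://www.example.org/links.html" "Mozilla/4.08"',
    '192.0.2.10 - - [10/Oct/2002:13:56:02 -0700] "GET /catalog.html HTTP/1.0" '
    '200 5120 "http://library.example/index.html" "Mozilla/4.08"',
    '198.51.100.7 - - [10/Oct/2002:14:01:17 -0700] "GET /index.html HTTP/1.0" '
    '200 2326 "-" "Mozilla/4.08"',
]

LINE_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def is_ipv4(text):
    """Return True if text is four numbers from 0 to 255 separated by periods."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


def parse_line(line):
    """Split one log line into named fields, or return None if it does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None


def summarize(lines):
    """Count hits per page and list (referring page, requested page) per IP address."""
    hits = Counter()
    paths = {}
    for line in lines:
        record = parse_line(line)
        if record is None or not is_ipv4(record["ip"]):
            continue  # skip lines that are malformed or lack an IPv4 address
        hits[record["path"]] += 1
        paths.setdefault(record["ip"], []).append(
            (record["referer"], record["path"])
        )
    return hits, paths


hits, paths = summarize(SAMPLE_LOG)
```

Here `hits` holds the number of requests for each page, and `paths` lists, for each IP address, the referring page and the page requested, which is the raw material for reconstructing a visitor's path.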


Many web sites use slightly more complex technologies such as cookies to analyze traffic and purchase patterns and to customize users' online experiences.  A cookie is a text-only string of data, unique to the user, that a web site stores in the user's browser and that the browser sends back to the web server when the user revisits the site.  Cookies may contain log-in or registration information, online "shopping cart" information (the user's buying patterns at a particular retail site), user preferences, and the last web site visited.  Some web sites use "session" or "transient" cookies, which track users only for a short period of time (generally, one "session") and are stored in temporary memory on the user's computer.  Other web sites use "persistent cookies," which may track a user's Internet habits over an extended period of time.  Persistent cookies permit a user to be "remembered" by a web site from one visit to the next, and they remain on the user's hard drive until they are erased or expire.
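
The difference between session and persistent cookies can be seen in a short Python sketch that uses the standard library's `http.cookies` module. The cookie names and values (`session_id`, `visitor_id`, and so on) are hypothetical; the point is only that a persistent cookie carries an expiration setting while a session cookie does not.

```python
from http.cookies import SimpleCookie

ONE_YEAR_IN_SECONDS = 60 * 60 * 24 * 365


def build_cookies():
    """Create one session cookie and one persistent cookie (hypothetical names)."""
    jar = SimpleCookie()

    # Session cookie: no expiration attribute, so the browser normally
    # discards it when the browsing session ends.
    jar["session_id"] = "abc123"
    jar["session_id"]["path"] = "/"

    # Persistent cookie: Max-Age asks the browser to keep it for a year,
    # which lets the site "remember" the visitor between visits.
    jar["visitor_id"] = "v-0042"
    jar["visitor_id"]["path"] = "/"
    jar["visitor_id"]["max-age"] = ONE_YEAR_IN_SECONDS
    return jar


def is_persistent(morsel):
    """A cookie is persistent if it carries a Max-Age or Expires attribute."""
    return bool(morsel["max-age"] or morsel["expires"])


def parse_cookie_header(header_value):
    """Read the Cookie header a browser sends back on a later visit."""
    jar = SimpleCookie()
    jar.load(header_value)
    return {name: morsel.value for name, morsel in jar.items()}


jar = build_cookies()
# Each string is the value of one Set-Cookie header sent by the server.
set_cookie_headers = [morsel.OutputString() for morsel in jar.values()]
returned = parse_cookie_header("session_id=abc123; visitor_id=v-0042")
```

On a return visit the browser sends both values back in a single `Cookie` header, and the server reads them with `parse_cookie_header`; that round trip is how a site recognizes a returning visitor.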


Another commonly used technology is stealth data recording software, which is installed without the user's knowledge to record personally identifiable information "behind the scenes" and send it to a third party.  Stealth software may be either an independent program or a program embedded within another software application.  Generally, a user unknowingly installs stealth software while installing a third-party application on a computer (either from a CD-ROM or from the Internet).  Sometimes, stealth software is installed on a computer during an online transaction, such as purchasing software or clothing from a web site.  Once installed, stealth software tracks personally identifiable information and periodically sends that information to a third party.


These technologies are of particular concern when they are used to conduct data mining.  Data mining is the practice of aggregating information about consumers' preferences and interests from a variety of sources, including cookies, stealth data software, voluntary purchases, and mailing lists, with the purpose of creating comprehensive profiles.  Most often the profiles are used for targeted advertisements, but federal and local governments are also increasingly relying on data mining to assemble profiles to investigate criminal and fraudulent activities.
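
As a rough illustration of how aggregation produces a profile, the Python sketch below merges records from several sources that share an identifier. Everything in it is an assumption made for the example: the source names, the `customer_id` key, and the sample records are invented, and real data mining systems are considerably more elaborate.

```python
from collections import defaultdict


def build_profiles(*sources):
    """Merge (source_name, records) pairs into one profile per customer_id."""
    profiles = defaultdict(lambda: {"sources": set(), "interests": set()})
    for source_name, records in sources:
        for record in records:
            profile = profiles[record["customer_id"]]
            profile["sources"].add(source_name)          # where the data came from
            profile["interests"].update(record["interests"])  # what it revealed
    return dict(profiles)


# Hypothetical records from three of the kinds of sources named above.
cookie_data = [{"customer_id": "c-17", "interests": ["gardening"]}]
purchase_data = [{"customer_id": "c-17", "interests": ["travel guides"]}]
mailing_list = [
    {"customer_id": "c-17", "interests": ["gardening"]},
    {"customer_id": "c-29", "interests": ["cooking"]},
]

profiles = build_profiles(
    ("cookies", cookie_data),
    ("purchases", purchase_data),
    ("mailing_list", mailing_list),
)
```

No single source says much about customer `c-17`, but the merged profile records both interests and every source that contributed to them, which is why aggregation raises greater privacy concerns than any one collection method alone.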


Further information:


CDT Consumer Privacy Guide:


Webopedia Definition & Links:


NYT Article "Fighting to Make a City's Cookie Files Public" on a legal battle over whether "cookie" files are public records. (Site requires registration and cookie acceptance.)


Cookie Central - Frequently Asked Questions About Cookies:


Microsoft/Internet Explorer Information on Cookies:


Netscape Tech Support, "Cookies: What They Are and How They



"A recipe for cookie management: Integrate an easy-to-use library for client-side cookie handling" (highly technical article on using Java for cookie management)



Copyright 2002, American Library Association, Office for Information Technology Policy




This Online Privacy Tutorial is a service of the American Library Association. The content of this tutorial is primarily the work of Leslie Harris & Associates in Washington, DC. The views expressed in these messages are not necessarily the views of ALA or Leslie Harris & Associates. This tutorial is for information only and will not necessarily provide answers to concerns that arise in any particular situation. This service is not legal advice and does not include many of the technical details arising under certain laws. If you are seeking legal advice to address specific privacy issues, you should consult an attorney licensed to practice in your state.