Daniel Greenstein

His experience comes from the Open Content Alliance (content partners like UC and the Internet Archive; tech partners like HP Labs and Yahoo; funders).

Interesting notes from talk:

Melvyl (the UC library catalog) has three recommender systems in place: a "more like this" feature, recs produced by an algorithm run on their circ data, and recs from Amazon.
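
He didn't say how the circ-data algorithm works. One simple possibility is "patrons who borrowed this also borrowed that" counting. Here is a minimal sketch of that idea, not Melvyl's actual method; the function names, the `(patron_id, item_id)` input shape, and the sample loans are all my own assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_circulation(loans):
    """Count how often two items were borrowed by the same patron.

    `loans` is an iterable of (patron_id, item_id) pairs. This input
    shape is an assumption, not Melvyl's real circulation format.
    """
    items_by_patron = defaultdict(set)
    for patron, item in loans:
        items_by_patron[patron].add(item)

    co_counts = defaultdict(Counter)
    for items in items_by_patron.values():
        # Every pair of items on one patron's record counts once.
        for a, b in combinations(sorted(items), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts


def recommend(co_counts, item, k=3):
    """Return up to k items most often borrowed alongside `item`."""
    neighbours = co_counts.get(item, {})
    # Highest count first; ties broken by item id so results are stable.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]


# Made-up sample data for illustration only.
loans = [
    ("p1", "A"), ("p1", "B"),
    ("p2", "A"), ("p2", "B"), ("p2", "C"),
    ("p3", "A"), ("p3", "C"), ("p3", "D"),
]
co_counts = build_co_circulation(loans)
print(recommend(co_counts, "A", k=2))
```

A real system would presumably anonymize patron data and filter out very popular items, but the talk didn't go into that.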

UC has found that faceted browse is more useful for non-experts (like K-12, community college, and undergraduate students) than a search box, since they don't know what's behind the box or what words to use. The California Digital Library gives ways to drill down (geography, time period, etc.); it provides thumbnail images and instant results, and the option to drill down further and refine stays available no matter what level you go to. This, he pointed out, required a lot of metadata augmentation, since the items in the library came from varying sources; mass text digitization would require less because of robust existing standards and the presence of full text.
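
To make the drill-down idea concrete for myself, here is a rough sketch of faceted browse: filter the records by the facet values picked so far, then recount the remaining facet values so the refine options are still there at every level. This is only an illustration, not how the California Digital Library implements it; the field names and sample records are made up.

```python
from collections import Counter


def apply_filters(records, selected):
    """Keep records matching every selected facet value."""
    return [
        r for r in records
        if all(r.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(records, facets):
    """Count the values of each facet within the current result set."""
    return {
        facet: Counter(r[facet] for r in records if facet in r)
        for facet in facets
    }


# Hypothetical records; consistent metadata like this is what the
# "metadata augmentation" step would have to supply.
records = [
    {"title": "Gold Rush map", "geography": "California", "period": "1800s"},
    {"title": "Mission photograph", "geography": "California", "period": "1900s"},
    {"title": "Comstock Lode survey", "geography": "Nevada", "period": "1800s"},
]
FACETS = ["geography", "period"]

# First drill-down: the user picks a geography.
hits = apply_filters(records, {"geography": "California"})
counts = facet_counts(hits, FACETS)
print(len(hits), dict(counts["period"]))
```

Each further selection just adds another entry to `selected` and reruns the same two steps, which is why the refine options never go away.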

Open services definitions–he was all about building collections in a way that allows others to build on top of them and repurpose them (I wondered whether this was another way of saying open architecture, or of talking about APIs).

Advantages of mass digitization:

Greater reference linking and discovery, done in an automated fashion (not just for journal literature but for books, images, sound, data (which scientists have requested), etc.)

Curation–the ability to build a corpus or canon that you couldn't build with physical holdings (e.g. APIS, the Advanced Papyrological Information System)

Localization, or metasearch across the items that people in a local community (intellectual as well as physical) care about. This kind of catering to small communities (esp. small scholarly communities) is the sort of thing Microsoft, Google, Yahoo and the like won't do because the market's too small. If we don't do it, no one will.

Dissemination of content in new ways (Amazon model of print-on-demand, OCLC library toolbar, Trove.net image reproductions)

Efficiencies–high-density storage facilities. UC stored one set of the 23,000 volumes represented in JSTOR. If all their libraries got rid of their duplicate copies (which they weren't required to do), UC would save 3.8 million in shelving costs.

Enrichment of existing texts (again the idea of building services on top of digital collections)–He mentioned Joe Esposito's Processed Book project (article in First Monday at http://www.firstmonday.org/issues/issue8_3/esposito/ and demo at http://prosaix.com/pbos/) and Wikipedia as examples of scholars and laypeople being willing and eager to add to a text

Personalization–he said that it was important to allow local users to present local views of content and to add a local narrative to those views

Funding–He basically said we'd have to choose what our priorities are and make a commitment. He also pointed out that it isn't so much a matter of finding new money as of reallocating money (the money we're spending on maintaining physical collections? Must go back and re-listen)

In Q&A someone asked a question (must also re-listen to hear where this comment originated), and he said that as a research university (which wasn't about teaching) UC wasn't going to be taking care of the pedagogical services needed on top of digital collections, but he hoped someone else would so that UC could take advantage of them. The way he put it sounded a bit harsh, but thinking about it later, it was honest (though the instruction librarians at Berkeley will probably string him up when he gets back). It's a sad truth that most research universities are more geared to research and major collections than to teaching. But I don't think that's how it should be or has to be.