Catalogers Group

Thursday, February 14, 2002

 

Present:  Aiken, Aivazian, Bross, LoPear, Matthiesen, McBride, Mendes, Morehead, Norris, Priebe, Ratliff (recorder), Riemer, Riggio, Stumps

 

John opened with an OTNG (Orion The Next Generation) report.  The sites to be visited are U. of Iowa (Ex Libris), Kansas (Voyager), and Indiana (Sirsi), during March 13-22.  Innovative has dropped out at this point, as they are developing a new version of their system on Windows Millennium right now.

 

1.  ALA Midwinter report (John)

 

John made the acquaintance of people from McGill and Iowa who are using Ex Libris. At the vendor booth, he observed a demonstration of the general concept that you can do simultaneous searches across several databases, e.g. both the OPAC and other data files. 

 

He tried this out on the Boston College OPAC, called Quest (http://ecs100.bc.edu:4545/ALEPH), but could not run the searches simultaneously, only sequentially.  This reinforces the need for authority control in all databases (including metadata projects) so that cross-database searches will produce similar results.

 

MARBI -- John attended the discussion about adding URLs to authority records (a proposal he submitted and MARBI approved in 1998).  While he prefers adding 856 fields to authority records, LC wants to record URLs in a 670 $u (which is a very narrow purpose for this data).  In a 670, you have to summarize in the $b the data you found.  With an 856, only the URL need be input and you could just point online to the entity represented in the heading.  Apparently LC is afraid of the URLs getting outdated, a maintenance burden LC ought to be asking PCC members to consider sharing.  MARBI simply evaluated LC's proposal on a technical basis.  One lack is that no $u was defined for the 675 field.  John thought that, if there is a demand for this kind of value added in authority records, perhaps a vendor could buy LC's authority file, populate the records with URLs, and sell it to those CDS customers who might prefer the enhanced version of the records for the same price!

 

OCLC offered a very entertaining play, "The Cataloger, the Public Services Librarian and Metadata: Can this Marriage Be Saved?"  Publicity is available at http://orc.dev.oclc.org:5103/corc-l/msg02138.html .  The playscript is due to be put online and the video offered for free loan via ILL.

 

LC booth -- LC Class Web will include an LC to Dewey classification correlation.  In the new product, a user's annotations will be retained with each update.  Some reference librarians have used this with the public.

 

PCC Participants' Meeting -- The Standards Committee will address metadata standards (such as Dublin Core and other non-MARC schemas) and recommend a standard set of data elements.  (See PCC minutes at: http://www.loc.gov/catdir/pcc/pccpart02m.html )

 

2.  Department laptop computer needs (Rita Stumps)

 

Rita is looking into obtaining a replacement laptop computer for use by department staff, as the current machine is obsolete.  She listed the applications available on the old machine, and solicited input from staff about what software and functionality are needed for the new laptop.   Those mentioned at this meeting were:  MS Office suite, browsers (Netscape and IE), an email program such as Eudora, floppy and CD-ROM drives, a fax modem, spare battery, hard carrying-case.

 

Please send Rita suggestions via email by next Friday, Feb. 22.  After that date, she will consult with David Leonian of LIS about other requirements and then compile a list of what is needed.  She is also researching costs.  LIS will obtain a machine from one of their suppliers.

 

3.  SCAER Update (John)

 

John gave a detailed report about recent activities of SCAER (Steering Committee on Access to Electronic Resources) and reviewed a white paper he has written entitled, "The Care and Feeding of ERDB."  [included at end of these minutes]

 

The committee has been meeting twice a week for 2 hours.  Members are: John, Dora Loh (co-chairs), Anita Colby, David Yamamoto, and Stephen Schwartz.  (Website: http://staff.library.ucla.edu/staff/committees/scaer/ )  They have established liaisons to other library committees, and Yamamoto is to receive and triage communications from library staff and the public. 

 

Their list of concerns and issues includes 20 items with high priority.  The top 3 are:

--how to refresh the ERDb built last September

--how to add new titles

--guidelines for editing

John's white paper suggests possible strategies for dealing with these issues.

 

Last Thursday, the committee met with Tim Jewell (U. of Washington).  One of Jewell's comments was that Washington’s ER database relies heavily on OPAC data!

 

The rights and responsibilities of being a subject term owner were defined.  These include defining the term's meaning and subterms, adding and removing the term from e-resources as appropriate, consulting and responding to other term owners, and alerting catalogers about e-resources falling into one’s subject area.

 

There was discussion about the keyword field and its uses:

-- to augment subject and title access

-- as a web page sub-arrangement device

-- to float the most important resources to the top of the list (relevancy)

 

SCAER has posted on its website:

-- ERdb Data Input Guideline (required & optional fields) <http://unitproj.library.ucla.edu/biomed/dat/index.html>

-- list of ERdb Subject Liaisons <http://staff.library.ucla.edu/staff/committees/scaer/liaisons.xls>

-- Tim Jewell's PowerPoint presentation <http://www.library.ucla.edu/libraries/cataloging/metadata/timjewell.ppt>

 

The committee will meet tomorrow with Sharon Farb to discuss the licensing functions and uses of the ERDB.  John would like users to be able to use it to send messages to each other.

 

On Feb. 27th there will be an open forum for editors and public services librarians.  By June the committee will develop a vision statement for submission to AdCon.

 

John distributed copies of his white paper, and then briefly summarized each of the issues, possible strategies, and the benefits and drawbacks of each.  This paper is due to be discussed in SCAER soon.  John's preferred vision for e-resource access: separate databases only when necessary, and only for as long as necessary.

 

John displayed the current Library home page, and comments were made.  Angela: the heading "Find Research Guides" is misleading; the word "Online" needs to be added.  Louise: links included in the YRL RIS web pages do not appear in the ERdb, and they point to valuable resources.  Rebecca: the research guides by the Bibs need to be integrated into the ERDB somehow.  John questioned the scope of the ERDB: should it be all resources we’ve cataloged, or a selected subset?  Jeff: the users aren't being told or shown where they are going; one cannot trust the web page usage statistics, because they are misleading.  John: government docs and e-books such as NetLibrary materials are not in the ERDB.  Luiz: the link to Orion2 looks insignificant.  Rebecca: users she tested said they went to Melvyl because they didn't trust Orion2.

 

4.  Announcements

 

John distributed copies of an AdCon handout on YCAL fares, which are negotiated airfares offered by UCLA Travel.  They allow one to cancel without penalty; however, they are not always the cheapest fares.  A PAC order is required to pay for a YCAL fare.

 

Next meeting:  Thursday, Feb. 28th from 2-3 pm (note time change) to discuss the vendor demonstrations by Endeavor (Friday, Feb. 22, 10:15-noon for Cat. and Authorities) and Ex Libris (Thurs., Feb. 28, 10:15-noon).

 

Respectfully submitted,

 

Louise Ratliff, Recorder


The Care and Feeding of ERDB

John Riemer

February 9, 2002

rev. February 13, 2002

 

 

 

Below are a number of possible strategies for updating the existing content of ERDB and for adding new resources.  They are laid out along a continuum extending from a minimum to a maximum of reliance on data from outside sources such as that in the online catalog.

 

Suggested criteria for evaluating options we have:

Ease of getting records into the system

Degree of duplication of effort involved

Extent to which operations can be automated.

 

 

I.  Possible Strategies for Refreshing ERDB Data Content

 

A.      Stand-alone ERDB.

Consider that the online catalog was a one-time source of seed data for ERDB in September 2001.  Write up general guidelines for manual editing by subject owners and their designees and turn them loose to edit and add records, at least during an initial period of experimentation.

 

Benefits: Provides room to experiment with a new tool for resource discovery. 

Sets aside concern with database synchronization.

 

Drawbacks:

The high degree of duplication of effort in inputting/importing and maintaining data.  The lack of data synchronization in records for e-resources that exist in the two databases.  Not well integrated in Library processes.  Not much in the way of record management tools, resulting in a lot of manual intervention.  Doesn’t take advantage of metadata available elsewhere.

 

B.       ERDB Based on OPAC Data.

For all ERDB records known to have a matching record in Orion, periodically overlay the ERDB fields that support public access with corresponding data from the cataloging record.

The strategy could be further extended by funneling to ERDB the licensing data available from CDL as well as any CDL titles not yet present in ERDB.  Other ERDB titles would be entered and maintained manually.
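
The periodic overlay described above might be sketched as follows.  This is only an illustration under stated assumptions: records are represented as plain Python dictionaries, and the field names in PUBLIC_FIELDS are hypothetical stand-ins for whichever ERDB fields actually support public access.  ERDB-only data, such as manually assigned descriptors, is left untouched.

```python
# Hypothetical names of ERDB fields that support public access and are
# refreshed from the catalog; the real list would come from the ERDB schema.
PUBLIC_FIELDS = ("title", "url", "coverage", "publisher")


def overlay_from_catalog(erdb_record, catalog_record, fields=PUBLIC_FIELDS):
    """Return a copy of erdb_record with its public fields refreshed.

    Only fields present in the catalog record are overlaid, so a missing
    catalog value never blanks out existing ERDB data.
    """
    refreshed = dict(erdb_record)
    for name in fields:
        if name in catalog_record:
            refreshed[name] = catalog_record[name]
    return refreshed


def refresh_all(erdb_records, catalog_by_dbcn):
    """Overlay every ERDB record that has a matching catalog record.

    Records without a DBCN, or whose DBCN is not found in the catalog,
    are passed through unchanged (they are maintained manually).
    """
    result = []
    for record in erdb_records:
        match = catalog_by_dbcn.get(record.get("dbcn"))
        result.append(overlay_from_catalog(record, match) if match else record)
    return result
```

Matching on the DBCN is an assumption consistent with the migration specs in section IV; any other stable identifier shared by the two databases would serve the same purpose.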

 

Benefits:

Minimizes duplication of effort.  The strategy of mapping cataloging record classification numbers to ERDB descriptors makes it possible to change the descriptor list without forcing human term assignment to start from scratch; all that is needed is a revision to the mapping table and re-execution of the mapping.

Flexibility.  Does not require every ERDB resource to appear in the OPAC.  For those ERDB resources appearing in the OPAC, does not require that they appear in the OPAC first.
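
To illustrate why a mapping table makes descriptor changes cheap, here is a minimal sketch.  The table contents and descriptor names are invented for the example (the actual table is not reproduced in this paper), and matching on the longest leading run of class letters is an assumption about how such a table might be applied.

```python
# Hypothetical mapping table: LC class prefix -> ERDB subject descriptor.
# The actual table and descriptor list are not given in this paper.
MAPPING_TABLE = {
    "QD": "Chemistry",
    "QA": "Mathematics",
    "Q": "Science (General)",
    "R": "Medicine",
}


def descriptor_for(call_number, table=MAPPING_TABLE):
    """Return the descriptor for the longest matching class prefix, or None."""
    letters = ""
    for ch in call_number.strip().upper():
        if not ch.isalpha():
            break
        letters += ch
    # Try the most specific prefix first: "QD" before "Q".
    for end in range(len(letters), 0, -1):
        if letters[:end] in table:
            return table[letters[:end]]
    return None


def remap(call_numbers_by_dbcn, table=MAPPING_TABLE):
    """Re-execute the mapping for every record after a table revision."""
    return {
        dbcn: descriptor_for(call_no, table)
        for dbcn, call_no in call_numbers_by_dbcn.items()
    }
```

Revising the descriptor list then amounts to editing the table and calling remap again; no record needs a term reassigned by hand.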

 

Drawbacks:

Detailed planning and programming for database synchronization.  The movement of data would be from Orion2 to ERDB, but probably also in the opposite direction. 

 

 

C.      ERDB as Front-End to the OPAC.

All the resources and all the information about them now publicly displayed in ERDB would be contained in the OPAC in such a manner that the user could experience the same look and feel as provided now.

 

Benefits:

The data would all reside in the same place, and we would have a single, comprehensive discovery vehicle.  The programming requirements would be simpler than in B and no greater than in A.  The human effort would be much less than in A and no greater than in B.  Database synchronization would not be an issue at all if there were but one database behind the multiple interfaces.  With a good ILS, we could gain built-in data management tools.  Could position the Library to identify and share records with Open Archives, since the structure and content of records has been well-defined. 

 

Drawbacks:

Dependence on the indexing turnaround time and response time of the underlying database.  Assumes ability in new ILS to define and to specially index the required number of local fields.  Requires all e-resources to reside in the OPAC and to arrive there by the time they are desired to appear in the ERDB interface.  Would add weight to the requirement for “event awareness” and batch deletion of records, which might conflict with other priorities like the need for circulation functionality. 

 

 

II.  Possible Strategies for Getting New Titles into ERDB

 

A.      Direct Manual Input to ERDB

Benefits: Timeliness—the record appears in ERDB as soon as the ‘Save’ button is hit.  Autonomy—an ERDB editor is able to effect this addition without any one else’s assistance.

 

Drawbacks: Manually keying in, or at least performing a copy-and-paste of all the fields one at a time; often having to compose the highly-recommended narrative description.  Difficult to access the ERDB client at home.

 

B.       Retrospective Designation from within Orion

Selectors, aware that a corresponding record already exists in Orion, would use the Taos Cat Client to flag that record for inclusion in ERDB in the next data migration.

 

Benefits: Saves selector from keying in all the data manually; just need to consider secondary subject descriptors and desired material types.

 

Drawbacks: Not as instantaneous.  Requires search of Orion, which currently does not index URL keywords in 856 fields.  Taos Cat Client not typically available off campus.

 

C.      Initiate in CORC

Utilizing web-based access to the e-resource descriptions in OCLC WorldCat, with option to view data elements in a Dublin Core view, selectors could designate records for Orion and/or ERDB.  Catalogers would send the records to ERDB via MARC-export to Orion, or possibly directly to ERDB via DC-XML export.

 

Benefits: Takes advantage of existing resource descriptions, quite often complete with summaries.  Can be done from any web browser, at work or at home.  Through carefully-chosen search parameters, it is possible to achieve the equivalent of an approval plan for free resources.

 

Drawbacks: Not instantaneous; Cataloging is in the habit of hastening access by sending on a preliminary-version of the record when further upgrades appear needed.  Involves search of another database, and sometimes selection from among duplicate records.

 

D.      Load from other data sources

Particularly in the case of record sets, it can represent a real gift to receive from publishers, vendors, bibliographic utilities, etc. a set of records in some standard form.

 

Benefits: Neither the selector nor the cataloger needs to create individual records. 

 

Drawbacks: Programming required to accommodate each data source and format that comes along.  The need to design a maintenance strategy for bulk deletion of the old record set and bulk import of the new set.
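
One possible shape for the maintenance strategy mentioned above is to tag each loaded record with its source, so that an old set can be removed and its replacement imported in one pass.  The sketch below assumes records are dictionaries carrying a "source" label; that label is a hypothetical field, not an existing ERDB element.

```python
def replace_record_set(database, source, new_records):
    """Swap out all records from one source in a single pass.

    database    -- list of record dicts, each with a "source" key
    source      -- label of the record set being replaced (hypothetical)
    new_records -- the freshly supplied set, as dicts without "source"
    Returns (updated_database, number_deleted, number_added).
    """
    # Keep everything that did not come from this source.
    kept = [rec for rec in database if rec.get("source") != source]
    # Stamp each incoming record with the source label for next time.
    added = [dict(rec, source=source) for rec in new_records]
    return kept + added, len(database) - len(kept), len(added)
```

The returned counts could feed a simple load report, which helps confirm that a bulk deletion removed only the intended set.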

 

 

III.  Where to Store ERDB-supporting Data in Cataloging Records

In the case of either I.B or I.C, fields need to be set aside in the cataloging records for the data elements that run ERDB.  Fields in the 9XX block signify local use, and this group of MARC tags is customarily protected from overlay when updated versions of the rest of the bibliographic record are received.  Another advantage is that they can be included by selectors who are working in the CORC database.

 

Suggestions, subject to confirmation that the particular tags have never been used previously by UCLA:

 

ERDB Subject Descriptor -> 950

ERDB Material Type -> 955

Narrative description (summary) -> 920

Keywords (to augment retrieval) -> 953

Web page subarrangement terms -> 954

(All these elements would be repeatable.)

 

Under plan I.C, indexing would need to be separate from that for the rest of the catalog, e.g. the 950 fields, while specially indexed for the ERDB interface, would not be lumped in with the rest of ‘SUBJ’ so as not to interfere with the precision available in Orion searching.

 

The mere presence of these fields in Orion could serve to identify those records with ERDB equivalents, which would need periodic refreshing.  For retrospective designation of other records in Orion for new inclusion in ERDB a flag in one of those 9XX fields, or another one, would trigger class number-to-ERDB descriptor mapping and other migration steps toward inclusion in ERDB.

The flag/trigger could be the word ‘ERDB’ itself in 950.  If a completely separate field is needed, a suggestion is 952.
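
A rough sketch of how a migration program might read these signals follows.  It assumes the 950 option for the flag (not a separate 952 field), represents a record as a dictionary mapping each MARC tag to a list of field values, and interprets a record carrying only the flag as a new designation; all three are simplifying assumptions rather than settled design.

```python
ERDB_TAGS = ("950", "955", "920", "953", "954")  # tags suggested in section III
FLAG_TAG = "950"     # this sketch assumes the flag lives in 950, not 952
FLAG_VALUE = "ERDB"


def erdb_status(record):
    """Classify a catalog record by its 9XX fields.

    record maps a MARC tag to a list of field values (a simplification
    of a real MARC record).  Returns one of:
      "new"     -- carries only the 'ERDB' flag: run class number-to-
                   descriptor mapping and the other migration steps
      "refresh" -- already carries ERDB data: needs periodic refreshing
      "none"    -- no ERDB involvement
    """
    values = [v for tag in ERDB_TAGS for v in record.get(tag, [])]
    if not values:
        return "none"
    flagged = FLAG_VALUE in record.get(FLAG_TAG, [])
    has_data = any(v != FLAG_VALUE for v in values)
    if flagged and not has_data:
        return "new"
    return "refresh"
```

For example, a record whose only 9XX content is a 950 field reading 'ERDB' would be classified "new", while one with assigned descriptors or material types would be classified "refresh".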

 

 

IV. Data flows between ERDB and Orion.

 

A.      If plan I.B is adopted, dataflow would logically go in the direction of Orion to ERDB for these elements (repetitively for those asterisked):

 

[E-Serials Migration from Taos to ERDB

Specs for Mapping MARC Fields into ERDB Fields

September 26, 2001]

 

ERDB Field            MARC Field to Migrate (in order of preference)

*Resource Title       130 $a. n, p
                      If no 130, 245$a. n, p : b

Display Title         Same as Resource Title

*Resource URL         856$u
                      If multiple URLs, create separate Resource Record for each and generate a report of DBCNs that yielded more than one record

*Coverage             856$3

*Accessibility        Based on 856$z
                      If text includes <space> UC <space>, set to UC Wide
                      If text includes <space> UCLA <space>, set to UCLA only
                      If no text and $x includes CDL, set to UC Wide
                      Else, set to All

*Author/Editor        100$a
                      110$a
                      111$a
                      If none of the above, 700$a (first 700 only)

*Continued By         785$a. t (first 785 only)

*Continues            780$a (first 780 only)

DBCN                  001

*ISBN                 020$a (first 020 and first $a only)

*ISSN                 022$a (first 022 and first $a only)

*Issued By            710$a
                      If more than one 710 field, include all separated by semicolons

*Publication Dates    008 date1 and 008 date2

*Publisher            260$b
                      NOTE:   If a corresponding record in Biomed’s ERdb exists, that publisher entry should replace the publisher from the 260$b

Subject Area(s)       Map from LC Call Numbers using ACC table
                      090$a
                      050$a (if more than one $a, use the last)
                      060$a (if more than one $a, use the last)
                      096$a
                      If multiple subject areas, add each separated by semicolons
                      If no LC Call Number field, leave Subject Area(s) blank

Type(s)               Electronic Journals  [if Leader/07 = s]
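
A few of the rules in the specs above can be expressed compactly in code.  The sketch below is illustrative only and is not the migration program actually used: it assumes a record is a dictionary mapping a MARC tag to a list of fields, each field being a dictionary of subfield code to value, and it joins title subfields with a single space rather than reproducing catalog punctuation.

```python
def first_subfield(record, tag, code):
    """Return subfield `code` of the first `tag` field, or None."""
    fields = record.get(tag, [])
    return fields[0].get(code) if fields else None


def accessibility(field_856):
    """Apply the 856 $z / $x rules, in the order given in the specs."""
    note = field_856.get("z", "")
    if " UC " in note:
        return "UC Wide"
    if " UCLA " in note:
        return "UCLA only"
    if not note and "CDL" in field_856.get("x", ""):
        return "UC Wide"
    return "All"


def resource_title(record):
    """130 $a, $n, $p if present; otherwise 245 $a, $n, $p, $b."""
    for tag, codes in (("130", "anp"), ("245", "anpb")):
        fields = record.get(tag, [])
        if fields:
            return " ".join(fields[0][c] for c in codes if c in fields[0])
    return None


def author(record):
    """100 $a, then 110 $a, then 111 $a, then the first 700 $a."""
    for tag in ("100", "110", "111", "700"):
        value = first_subfield(record, tag, "a")
        if value:
            return value
    return None


def issued_by(record):
    """All 710 $a values, separated by semicolons."""
    return "; ".join(f["a"] for f in record.get("710", []) if "a" in f)
```

Note that the accessibility test follows the specs literally: the term must have a space on both sides, so a note that begins or ends with "UC" would fall through to a later rule.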

 

B.  At least on a one-time basis, data cited in section III would need to flow from ERDB to Taos.   This would flag those Orion e-resource records that have an ERDB equivalent and that therefore will need recurring data migration.  (Alternatively, is it enough for the mere presence in ERDB of a Taos Database Control Number (DBCN) to support re-migration of data?)

 

C.  Because of the possibility of ERDB editors altering or adding to the subject term(s) resulting from automatic mapping from classification numbers, ERDB-to-Taos data flow will need to occur repetitively.

 

D.  If titles unique to ERDB are desired in Orion, perhaps a data element in ERDB could signify that for a program.  A Perl script and MarcMakr could be used to generate provisional records for loading into Taos.  (Alternatively, the recently-generated ‘No-DBCN’ list Andy Kohler created could serve as a notification service for cataloging centers.)

 

 

 

 

V.      Changing the Data Structure of ERDB in Support of Automating the Update of Data Content

 

Re-migrating data to multiple ERDB Resource records from an Orion record that has or had multiple 856 fields poses great difficulty.  How does one coordinate a changed URL in Orion with the ERDB equivalent field it should replace?  How can an 856 field deleted with cause from Orion trigger the same action in ERDB?

 

The duplicative resource titles, specially/artificially qualified in ERDB by aggregator, etc. might not be necessary if we could change the record structure in ERDB.  Is there a way we could clearly present to users a single title as being available from multiple places with different E-holdings?
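
One way to picture the changed record structure is a single resource carrying a list of e-holdings, one per 856 field.  The class and attribute names below are hypothetical and do not describe the existing ERDB schema; the sketch is meant only to show how such a structure would sidestep the coordination problem raised above.

```python
from dataclasses import dataclass, field


@dataclass
class EHolding:
    """One place a title is available (hypothetical structure)."""
    provider: str        # aggregator or publisher label
    url: str             # from one 856 $u
    coverage: str = ""   # from the same field's 856 $3


@dataclass
class Resource:
    """A single title carrying any number of e-holdings."""
    dbcn: str
    title: str
    holdings: list = field(default_factory=list)

    def sync_holdings(self, current):
        """Replace holdings with the catalog's current 856 data.

        Because holdings hang off one resource keyed by DBCN, a changed
        or deleted 856 field needs no URL-by-URL matching: the whole
        list is rebuilt.  Returns the URLs that were dropped and added.
        """
        old = {h.url for h in self.holdings}
        new = {h.url for h in current}
        self.holdings = list(current)
        return sorted(old - new), sorted(new - old)
```

With this arrangement the user would see one title with several access points, and the artificial qualification of titles by aggregator would no longer be needed to tell the records apart.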

 

 

VI.  Implications for Cataloging Record Permissions in the Above

A.  The 9XX fields cited in section III need to be open to all ERDB editors who are using the Cat Client. 

B.  All Orion Internet resource records need to be open to editors wanting to add the ‘ERDB’ flag.

C.  The summary (520) and general keyword (653) fields may also need to be open to editing by non-catalogers.

D.  The CORC collaborative cataloging process in II.C might need to become commonplace.