Workshop on Clustering Large Data Sets
November 19, 2003
Melbourne, Florida

To be Held in Conjunction with the
Third IEEE International Conference on Data Mining (ICDM 2003)



Theme Statement

Applications in many domains produce very large and frequently high-dimensional data sets; in text/web mining and bioinformatics, for example, the dimension of the data can run to the hundreds or thousands. In addition to being high-dimensional, these data sets are often sparse. Clustering such large, high-dimensional data sets is a contemporary challenge. Successful algorithms must avoid the curse of dimensionality while remaining computationally efficient.

A one-day workshop on Clustering Large Data Sets is being held in conjunction with ICDM 2003 in Melbourne, Florida (November 2003) to bring together researchers to present their current approaches and results in clustering large data sets that arise in various applications. Particular areas of interest are text mining, clustering of bioinformatics data, online clustering, and clustering of market-basket and web log data.

Topics of interest include:

Submission Requirements

Original papers on clustering large and high-dimensional data are solicited. For consideration, send an electronic submission (PostScript or PDF only, printable on 8.5 x 11 paper) to Jacob Kogan; phone: (410) 455-3297; fax: (410) 455-1066.

An email containing the title, authors, and abstract of the paper should be sent separately in plain ASCII format (no HTML tags, please).

To guarantee consideration, manuscripts must be received by September 10, 2003 (the final manuscript must be no more than 10 pages). Submission of work in progress is also encouraged.

All accepted papers whose camera-ready copies are received by the October 15, 2003 deadline (see below) will be distributed as photocopied proceedings, available for purchase by attendees at the conference. A LaTeX style file is available.

Important Dates

Papers Due:
September 10th, 2003

Notification of Acceptance:
October 5th, 2003

Camera ready:
October 15th, 2003

Workshop:
November 19th, 2003

Workshop Schedule

Program Committee

Cliff Behrens, Telcordia Technologies
Pavel Berkhin, Yahoo
Alex Bolshoy, Genome Diversity Center
Paul Bradley, Bradley Data Consulting, LLC
Moses Charikar, Princeton University
Chris Ding, Lawrence Berkeley National Lab
Chris Fraley, University of Washington
Thomas Hofmann, Brown University
George Karypis, University of Minnesota
Shailesh Kumar, HNC
Arie Leizarowitz, Technion, Israel
Dharmendra Modha, IBM Almaden Research Center
Amit Sahai, Princeton University
Nick Street, University of Iowa
Marc Teboulle, Tel-Aviv University
Zeev (Vladimir) Volkovich, Ort Braude College, Israel
Shi Zhong, Florida Atlantic University
Luba Zlatin, C-Ark, Israel

Organizing Committee

Daniel Boley
Department of Computer Science & Engineering
University of Minnesota
Minneapolis, MN 55455
Phone: (612) 625-3887
Fax: (612) 625-0572

Inderjit Dhillon
Department of Computer Science
University of Texas
Austin, TX 78712-1188
Phone: (512) 471-9725
Fax: (512) 471-8885

Joydeep Ghosh
Department of Electrical & Computer Engineering
Department of Computer Sciences
University of Texas
Austin, TX 78712-1188
Phone: (512) 471-8980
Fax: (512) 471-2893

Jacob Kogan
Department of Mathematics & Statistics
Department of Computer Science & Electrical Engineering
Univ. of Maryland, Baltimore County
Baltimore, MD 21250
Phone: (410) 455-3297
Fax: (410) 455-1066

Last modified on October 20, 2003.