What do CPL branch libraries have in common?

As we finish our omnibus white paper on the project, we have updated some of the maps and visualizations to include the latest book seasons and recolored them to show the branch clusters and checkouts per book. We’ve also excerpted some of the explanation here.



To create Figure C, we performed unsupervised clustering of the branches based on their demographic characteristics. We used the Partitioning Around Medoids (PAM) algorithm [Kaufman 1990], which is known to be more robust to noise and outliers than the more widely used k-means algorithm. PAM is more computationally intensive than k-means, but for our small data set this was not a significant drawback. We explored various combinations of cluster counts and selected features, using the silhouette metric to discriminate between the different choices. Our final set of five clusters was created using the top eight principal components and had an average silhouette width of 0.3.

Figure C shows these clusters, which include patterns perhaps familiar to students of the segregated history of Chicago. Near-north side neighborhoods with higher property values are found in cluster 5. Surrounding them in cluster 2 are diverse neighborhoods with mostly rental property units. Mostly African-American areas on the south and west sides of the city are grouped in cluster 4. Hispanic areas are found mostly in cluster 3. Cluster 1 is the “bungalow belt” of older ethnic neighborhoods, now occupied by a diverse mix of residents and distinguished from some of the other areas by a higher rate of home ownership. Note the inclusion of Chicago’s Chinatown (CN branch) in this group. Note also that the three regional libraries are treated as a single separate cluster for analysis purposes.
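For readers who want to see the mechanics, here is a minimal sketch of the kind of procedure described above: cluster around medoids, then use the average silhouette width to compare cluster counts. It is not the code behind Figure C. The points are made-up 2-D stand-ins for the branch features, the principal-components step is skipped, and the function names (`pam`, `average_silhouette`) are our own. The swap search is a simplified variant that starts from random medoids rather than the BUILD step of the original PAM algorithm.

```python
import math
import random
from itertools import product


def pairwise_distances(points):
    """Euclidean distance matrix for a list of coordinate tuples."""
    return [[math.dist(p, q) for q in points] for p in points]


def total_cost(dist, medoids):
    """Sum of each point's distance to its nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def pam(dist, k, seed=0):
    """Simplified PAM: random start, then accept any cost-reducing swap."""
    n = len(dist)
    medoids = random.Random(seed).sample(range(n), k)
    best = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(dist, trial)
            if cost < best - 1e-12:
                medoids, best, improved = trial, cost, True
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels


def average_silhouette(dist, labels):
    """Mean silhouette width over all points (singletons score 0)."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    widths = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)
            continue
        # a: mean distance to the rest of the point's own cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Hypothetical toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
dist = pairwise_distances(points)

medoids, labels = pam(dist, k=2)
score = average_silhouette(dist, labels)

# Compare candidate cluster counts by average silhouette width.
scores = {k: average_silhouette(dist, pam(dist, k)[1]) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
print(labels, round(score, 2), best_k)
```

Because the medoids are always actual data points, a single extreme point pulls the cluster center less than it would pull a k-means centroid, which is the robustness property mentioned above. The same comparison loop could be extended to vary the feature set as well as the cluster count.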

More to come …

