Spaces: society-ethics/StableBias

yjernite commited on Mar 4, 2023

Commit

ac7fd1c

•

1 Parent(s): d11d534

section 2

Files changed (1)

app.py +166 -72

@@ -34,6 +34,20 @@ _ID_CLUSTER_SCREEN_SHOTS = {
     15: ("cluster_15_of_24_woman_white.JPG", "Cluster 15 of 24"),
 }
 def get_images(path):
     images = [Image.open(os.path.join(path, im)) for im in os.listdir(path)]
@@ -41,26 +55,45 @@ def get_images(path):
     return [(im, path) for im, path in zip(images, paths)]
 def show_id_images(cl_id_1, cl_id_2, cl_id_3):
     img_path_1, cluster_name_1 = _ID_CLUSTER_SCREEN_SHOTS[cl_id_1]
     img_path_2, cluster_name_2 = _ID_CLUSTER_SCREEN_SHOTS[cl_id_2]
     img_path_3, cluster_name_3 = _ID_CLUSTER_SCREEN_SHOTS[cl_id_3]
     return (
         gr.update(
-            value=os.path.join(impath, img_path_1),
             label=f"Screenshot of the Identity Exploration tool for: {cluster_name_1}",
         ),
         gr.update(
-            value=os.path.join(impath, img_path_2),
             label=f"Screenshot of the Identity Exploration tool for: {cluster_name_2}",
         ),
         gr.update(
-            value=os.path.join(impath, img_path_3),
             label=f"Screenshot of the Identity Exploration tool for: {cluster_name_3}",
         ),
     )
 with gr.Blocks() as demo:
     gr.Markdown(
         """
@@ -140,7 +173,6 @@ with gr.Blocks() as demo:
             to showcase the visual trends encoded in these clusters - as well as their relation to the social variables under consideration.
             """
         )
-        impath = "images/identities"
         with gr.Row():
             with gr.Column(scale=1):
                 gr.Markdown(
@@ -172,7 +204,7 @@ with gr.Blocks() as demo:
                     show_label=False,
                 )
                 identity_screenshot_1 = gr.Image(
-                    value=os.path.join(impath, "cluster_2_of_24_latinx_woman.JPG"),
                     label="Screenshot of the Identity Exploration tool for: Cluster 2 of 24",
                 )
         with gr.Row():
@@ -184,7 +216,7 @@ with gr.Blocks() as demo:
                 )
                 identity_screenshot_2 = gr.Image(
                     value=os.path.join(
-                        impath, "cluster_3_of_24_native_american_stereetotypical.JPG"
                     ),
                     label="Screenshot of the Identity Exploration tool for: Cluster 3 of 24",
                 )
@@ -226,7 +258,7 @@ with gr.Blocks() as demo:
                     The clusters with the most examples of prompts with unspecified gender and unspecified ethnicity terms are **clusters 5 and 19**,
                     and both are also strongly associated with the words *man*, *White*, and *Caucasian*.
                     This association holds across genders (as showcased by **cluster 15**, which has a majority of *woman* and *White* prompts along with unspecified ethnicity)
-                    and across ethnicities (comparing the proportions of unspecified genders in **clusters 0 and 6**: 18 % and 38% for the clusters with more *aoman* and more *man* respectively along with the *African American* phrases).
                     This provides the beginning of an answer to our motivating question: since users rarely specify an explicit gender or ethnicity when using
                     these systems to generate images of people, the high likelihood of defaulting to *Whiteness* and *masculinity* is likely to at least partially explain the observed lack of diversity in the outputs.
@@ -241,10 +273,19 @@ with gr.Blocks() as demo:
                 )
                 identity_screenshot_3 = gr.Image(
                     value=os.path.join(
-                        impath, "cluster_19_of_24_unmarked_white_unmarked_man.JPG"
                     ),
                     label="Screenshot of the Identity Exploration tool for: Cluster 19 of 24",
                 )
         for var in [id_cl_id_1, id_cl_id_2, id_cl_id_3]:
             var.change(
                 show_id_images,
@@ -260,37 +301,76 @@ with gr.Blocks() as demo:
         """
         ### Quantifying Social Biases in Image Generations: Professions
-        Machine Learning models encode and amplify biases that are represented in the data that they are trained on -
-        this can include, for instance, stereotypes around the appearances of members of different professions.
-        In our study, we prompted the 3 text-to-image models with texts pertaining to 150 different professions
-        and analyzed the presence of different identity groups in the images generated. We found evidence of many societal stereotypes in the images generated,
-        such as the fact that people in positions of power (e.g. director, CEO) are often White- and male-appearing,
-        while the images generated for other professions are more diverse.
-        Read more about our findings in the accordion below or directly via the [Diffusion Cluster Explorer](https://hf.co/spaces/society-ethics/DiffusionClustering) tool.
         """
     )
     with gr.Accordion("Quantifying Social Biases in Image Generations: Professions", open=False):
         gr.Markdown(
             """
             <br/>
-            We also explore the correlations between the professions that use used in our prompts and the different identity clusters that we identified.
-            Using the [Profession Bias Tool](https://hf.co/spaces/society-ethics/DiffusionClustering)
-            we can see which clusters are most correlated with each profession and what identities are in these clusters.
             """
         )
-        impath = "images/bias"
         with gr.Row():
             with gr.Column(scale=1):
                 gr.Markdown(
                     """
                     #### [Diversity and Representation across Models](https://hf.co/spaces/society-ethics/DiffusionClustering "you can cycle through screenshots of the tool in use on the right, or go straight to the interactive demo")
-                    Using the **[Profession Bias Tool](https://hf.co/spaces/society-ethics/DiffusionClustering)**,
-                    we can see that the top cluster for the CEO and director professions is **Cluster 4**:
-                    We can see that the most represented gender term is *man* (56%  of the cluster) and *White* (29% of the cluster).
-                    This is consistent with common stereotypes regarding people in positions of power, who are predominantly male, according to the US Labor Bureau Statistics.
                     """
                 )
             with gr.Column(scale=1):
@@ -298,15 +378,17 @@ with gr.Blocks() as demo:
                     choices=[
                         "Results table: all models",
                         "Results table: Stable Diffusion v1.4",
-                        "Results table: Stable Diffusion v2,",
                         "Results table: Stable Diffusion Dall-E 2",
                         "Comparison histogram: all professions",
                     ],
                     value="Results table: all models",
                     show_label=False,
                 )
                 bias_screenshot_1 = gr.Image(
-                    value=os.path.join(impath, "cluster_assign_24_all.png"),
                     label="Screenshot of the Profession Bias Tool | Results table: all models",
                 )
         with gr.Row():
@@ -322,28 +404,28 @@ with gr.Blocks() as demo:
                     show_label=False,
                 )
                 bias_screenshot_2 = gr.Image(
-                    value=os.path.join(impath, "cluster_assign_mental_health_24_all.png"),
                     label="Screenshot of the Profession Bias Tool | Results table: mental health professions, all models",
                 )
-                mental_helth_examlpars = gr.Gallery(
                     [
-                        (Image.open(os.path.join(impath, im)), name)
                         for im, name in [
-                            ("social_assistant_2_of_24.png", "Generated images of 'social assistant' assigned to cluster 2 of 24"),
-                            ("social_assistant_5_of_24.png", "Generated images of 'social assistant' assigned to cluster 5 of 24"),
-                            ("social_assistant_15_of_24.png", "Generated images of 'social assistant' assigned to cluster 15 of 24"),
-                            ("social_assistant_19_of_24.png", "Generated images of 'social assistant' assigned to cluster 19 of 24"),
-                            ("social_assistant_0_of_24.png", "Generated images of 'social assistant' assigned to cluster 0 of 24"),
-                            ("social_worker_2_of_24.png", "Generated images of 'social worker' assigned to cluster 2 of 24"),
-                            ("social_worker_5_of_24.png", "Generated images of 'social worker' assigned to cluster 5 of 24"),
-                            ("social_worker_15_of_24.png", "Generated images of 'social worker' assigned to cluster 15 of 24"),
-                            ("social_worker_19_of_24.png", "Generated images of 'social worker' assigned to cluster 19 of 24"),
-                            ("social_worker_0_of_24.png", "Generated images of 'social worker' assigned to cluster 0 of 24"),
-                            ("psychologist_2_of_24.png", "Generated images of 'psychologists' assigned to cluster 2 of 24"),
-                            ("psychologist_5_of_24.png", "Generated images of 'psychologists' assigned to cluster 5 of 24"),
-                            ("psychologist_15_of_24.png", "Generated images of 'psychologists' assigned to cluster 15 of 24"),
-                            ("psychologist_19_of_24.png", "Generated images of 'psychologists' assigned to cluster 19 of 24"),
-                            ("psychologist_0_of_24.png", "Generated images of 'psychologists' assigned to cluster 0 of 24"),
                         ]
                     ],
                     label="Example images generated by three text-to-image models (Dall-E 2, Stable Diffusion v1.4 and v.2)",
@@ -354,39 +436,51 @@ with gr.Blocks() as demo:
                     """
                     #### [Focused Comparison: Mental Health Professions](https://hf.co/spaces/society-ethics/DiffusionClustering "you can cycle through screenshots of the tool in use on the left and example images below, or go straight to the interactive demo")
-                    If we look at the cluster representation of professions such as social assistant and social worker,
-                    we can observe that the former is best represented by **Cluster 2**, whereas the latter has a more uniform representation across multiple clusters:
-                    Cluster 2 is best represented by the gender term is *woman* (81%) as well as *Latinx* (19%).
-                    This gender proportion is exactly the same as the one provided by the United States Labor Bureau (which you can see in the table above), with 81% of social assistants identifying as women.
                     """
                 )
-        if False:
-#        with gr.Row():
-            mental_helth_examlpars = gr.Gallery(
-                [
-                    (Image.open(os.path.join(impath, im)), name)
-                    for im, name in [
-                        ("social_assistant_0_of_24.png", "Generated images of 'social assistant' assigned to cluster 0 of 24"),
-                        ("social_assistant_2_of_24.png", "Generated images of 'social assistant' assigned to cluster 2 of 24"),
-                        ("social_assistant_5_of_24.png", "Generated images of 'social assistant' assigned to cluster 5 of 24"),
-                        ("social_assistant_0_of_24.png", "Generated images of 'social assistant' assigned to cluster 0 of 24"),
-                        ("social_assistant_0_of_24.png", "Generated images of 'social assistant' assigned to cluster 0 of 24"),
-                        ("social_worker_0_of_24.png", "Generated images of 'social worker' assigned to cluster 0 of 24"),
-                        ("social_worker_2_of_24.png", "Generated images of 'social worker' assigned to cluster 2 of 24"),
-                        ("social_worker_5_of_24.png", "Generated images of 'social worker' assigned to cluster 5 of 24"),
-                        ("social_worker_0_of_24.png", "Generated images of 'social worker' assigned to cluster 0 of 24"),
-                        ("social_worker_0_of_24.png", "Generated images of 'social worker' assigned to cluster 0 of 24"),
-                        ("psychologist_0_of_24.png", "Generated images of 'psychologists' assigned to cluster 0 of 24"),
-                        ("psychologist_2_of_24.png", "Generated images of 'psychologists' assigned to cluster 2 of 24"),
-                        ("psychologist_5_of_24.png", "Generated images of 'psychologists' assigned to cluster 5 of 24"),
-                        ("psychologist_0_of_24.png", "Generated images of 'psychologists' assigned to cluster 0 of 24"),
-                        ("psychologist_0_of_24.png", "Generated images of 'psychologists' assigned to cluster 0 of 24"),
-                    ]
                 ],
-                label="Example images generated by three text-to-image models (Dall-E 2, Stable Diffusion v1.4 and v.2).",
-                show_label=False,
-            ).style(grid=[3, 5], height="auto")
     gr.Markdown(
         """

     15: ("cluster_15_of_24_woman_white.JPG", "Cluster 15 of 24"),
 }
+_BIAS_STATS_SCREEN_SHOTS = {
+    "Results table: all models": "cluster_assign_24_all.png",
+    "Results table: Stable Diffusion v1.4": "cluster_assign_24_sd14.png",
+    "Results table: Stable Diffusion v2.": "cluster_assign_24_sd2.png",
+    "Results table: Stable Diffusion Dall-E 2": "cluster_assign_24_dalle.png",
+    "Comparison histogram: all professions": "all_profs_histo_24.png",
+    "CEO examplars: Cluster 5": "ceo_5_of_24.png",
+    "CEO examplars: Cluster 6": "ceo_6_of_24.png",
+    "Results table: mental health professions, all models": "cluster_assign_mental_health_24_all.png",
+    "Comparison histogram: psychologist": "psychologist_histo_24.png",
+    "Comparison histogram: social worker": "social_worker_histo_24.png",
+    "Comparison histogram: social assistant": "social_assistant_histo_24.png",
+}
 def get_images(path):
     images = [Image.open(os.path.join(path, im)) for im in os.listdir(path)]
     return [(im, path) for im, path in zip(images, paths)]
+impath_id = "images/identities"
 def show_id_images(cl_id_1, cl_id_2, cl_id_3):
     img_path_1, cluster_name_1 = _ID_CLUSTER_SCREEN_SHOTS[cl_id_1]
     img_path_2, cluster_name_2 = _ID_CLUSTER_SCREEN_SHOTS[cl_id_2]
     img_path_3, cluster_name_3 = _ID_CLUSTER_SCREEN_SHOTS[cl_id_3]
     return (
         gr.update(
+            value=os.path.join(impath_id, img_path_1),
             label=f"Screenshot of the Identity Exploration tool for: {cluster_name_1}",
         ),
         gr.update(
+            value=os.path.join(impath_id, img_path_2),
             label=f"Screenshot of the Identity Exploration tool for: {cluster_name_2}",
         ),
         gr.update(
+            value=os.path.join(impath_id, img_path_3),
             label=f"Screenshot of the Identity Exploration tool for: {cluster_name_3}",
         ),
     )
+impath_bias = "images/bias"
+def show_bias_images(screen_id_1, screen_id_2):
+    img_path_1 = _BIAS_STATS_SCREEN_SHOTS[screen_id_1]
+    img_path_2 = _BIAS_STATS_SCREEN_SHOTS[screen_id_2]
+    return (
+        gr.update(
+            value=os.path.join(impath_bias, img_path_1),
+            label=f"Screenshot of the Profession Bias Tool | {screen_id_1}",
+        ),
+        gr.update(
+            value=os.path.join(impath_bias, img_path_2),
+            label=f"Screenshot of the Profession Bias Tool | {screen_id_2}",
+        ),
+    )
 with gr.Blocks() as demo:
     gr.Markdown(
         """
             to showcase the visual trends encoded in these clusters - as well as their relation to the social variables under consideration.
             """
         )
         with gr.Row():
             with gr.Column(scale=1):
                 gr.Markdown(
                     show_label=False,
                 )
                 identity_screenshot_1 = gr.Image(
+                    value=os.path.join(impath_id, "cluster_2_of_24_latinx_woman.JPG"),
                     label="Screenshot of the Identity Exploration tool for: Cluster 2 of 24",
                 )
         with gr.Row():
                 )
                 identity_screenshot_2 = gr.Image(
                     value=os.path.join(
+                        impath_id, "cluster_3_of_24_native_american_stereetotypical.JPG"
                     ),
                     label="Screenshot of the Identity Exploration tool for: Cluster 3 of 24",
                 )
                     The clusters with the most examples of prompts with unspecified gender and unspecified ethnicity terms are **clusters 5 and 19**,
                     and both are also strongly associated with the words *man*, *White*, and *Caucasian*.
                     This association holds across genders (as showcased by **cluster 15**, which has a majority of *woman* and *White* prompts along with unspecified ethnicity)
+                    and across ethnicities (comparing the proportions of unspecified genders in **clusters 0 and 6**: 18 % and 38% for the clusters with more *woman* and more *man* respectively along with the *African American* phrases).
                     This provides the beginning of an answer to our motivating question: since users rarely specify an explicit gender or ethnicity when using
                     these systems to generate images of people, the high likelihood of defaulting to *Whiteness* and *masculinity* is likely to at least partially explain the observed lack of diversity in the outputs.
                 )
                 identity_screenshot_3 = gr.Image(
                     value=os.path.join(
+                        impath_id, "cluster_19_of_24_unmarked_white_unmarked_man.JPG"
                     ),
                     label="Screenshot of the Identity Exploration tool for: Cluster 19 of 24",
                 )
+        demo.load(
+            show_id_images,
+            inputs=[id_cl_id_1, id_cl_id_2, id_cl_id_3],
+            outputs=[
+                identity_screenshot_1,
+                identity_screenshot_2,
+                identity_screenshot_3,
+            ],
+        )
         for var in [id_cl_id_1, id_cl_id_2, id_cl_id_3]:
             var.change(
                 show_id_images,
         """
         ### Quantifying Social Biases in Image Generations: Professions
+        In the previous section, we provided a method to characterize how text-to-image systems
+        associate textual mentions of identity characteristics with visual features in their outputs.
+        Exploring the inherent representations learned through our method already gave us insights into some
+        of the bias dynamics embedded in the models that support these systems.
+        We take the analysis further in the present section by leveraging these representations to study **social biases tied to a particular application
+        setting:** specifically, we consider a setting where users generate pictures of people in professional settings, for example to use in place of
+        stock images in articles or websites.
+        To that end, we systematically generate a large variety of images for 150 professions for each of 3 systems,
+        and compare the distribution of the visual features identified in the [Identity Representation Demo](https://hf.co/spaces/society-ethics/DiffusionFaceClustering)
+        across all generations and across generations for each profession.
+        You can explore these results in detail in the companion [Profession Bias Tool](https://hf.co/spaces/society-ethics/DiffusionClustering) -
+        in particular, you can read about the resulting **diversity metric** for different models and professions,
+        as well as a case study **comparing the representations of different mental health professions**, by expanding the accordion below:
         """
     )
     with gr.Accordion("Quantifying Social Biases in Image Generations: Professions", open=False):
         gr.Markdown(
             """
             <br/>
+            Machine Learning models encode and amplify biases that are represented in the data that they are trained on -
+            this can include, for instance, stereotypes around the demographic makeup of different professions.
+            When the models are used in image generation systems that then contribute to the distribution of web images through integration
+            in image editing software or stock imagery services, these representation biases can then feed into complex sociotechnical systems.
+            For example, previous work has shown that seeing less diverse image search engine results can negatively impact under-represented groups' sense of belonging in their job,
+            or reinforce implicit biases in hiring that then lead to further exclusion and in turn again to less diverse representation.
+            As previously mentioned, measuring the extent of those biases in model outputs is complicated by the lack of inherent social characteristics
+            for the synthetic people depicted in the generated images. In this section, we show how to leverage the [identity clusters](https://hf.co/spaces/society-ethics/DiffusionFaceClustering)
+            introduced in the previous section to address this difficulty.
+            For each of the 3 image generation systems, we generate a dataset of images corresponding to prompts of the format *"Photo portrait of a **(adjective)** **[profession]**"*.
+            The adjectives serve both as a source of additional variability and as the focus of their own analysis (see our paper).
+            We use a list of 20 adjectives and 150 professions from the [US Bureau of Labor Statistics (BLS)](https://www.bls.gov/cps/cpsaat11.htm).
+            We then assign each image to a cluster based on a dot product between the image embedding and cluster centroid.
+            This allows us to answer our motivating question about quantifying whether images depicting a certain profession are **more likely to look like** images
+            corresponding to prompts explicitly mentioning specific genders or ethnicities (for example, mostly *man* and mostly *White* for images in cluster 5)
+            without assigning an identity characteristic to an individual generation for *e.g.* a *"Photo portrait of a compassionate CEO"* prompt.
+            The results are presented in both tabs of the [Profession Bias Tool](https://hf.co/spaces/society-ethics/DiffusionClustering).
+            The **Professions Overview** tab lets users select a system (or *All Models*) and a subset of professions (or *all professions* together)
+            to print a table showing its distribution over the top 8 identity clusters, its diversity as measured by the entropy of this distribution,
+            and the gender ratio for this profession in the US as reported by the [BLS](https://www.bls.gov/cps/cpsaat11.htm "specifically, the reported proportion of women").
+            It also provides a summary description of the clusters for convenience.
+            The **Profession Focus** tab lets users select a single profession and compares the distribution across identity clusters for all image generation systems in a histogram.
+            It also provides examples of images generated for the profession that are assigned to each of the clusters.
             """
         )
         with gr.Row():
             with gr.Column(scale=1):
                 gr.Markdown(
                     """
                     #### [Diversity and Representation across Models](https://hf.co/spaces/society-ethics/DiffusionClustering "you can cycle through screenshots of the tool in use on the right, or go straight to the interactive demo")
+                    We start by looking at the summary statistics provided in the [**Professions Overview** tab](https://hf.co/spaces/society-ethics/DiffusionClustering "select screenshots right or go straight to the demo").
+                    **Clusters 5 and 19**, which are mostly made up of images generated for prompts mentioning *man*, *White*, and *Caucasian*,
+                    are the most represented across systems and professions, accounting for 53.6% of generations.
+                    This over-representation (less than 31% of respondents to the 2020 US census checked the boxes for *man* and *White*) is not equally distributed, however:
+                    these two clusters make up 69.7% of the *CEO* images, but only 35.7% and 22.1% for *fast food worker* and *social worker* respectively.
+                    Next, compare the tables for systems Stable Diffusion v1.4, v2, and Dall-E 2.
+                    These same clusters are somewhat less over-represented in Stable Diffusion v1.4, accounting for 39.6% of generations across professions and 43.3% for *CEO*,
+                    but Dall-E 2 shows the strongest disparities with 68.7% and 87.7% for all professions and *CEO* respectively.
+                    Stable Diffusion v2 falls in the middle (52.4% all, 78.1% *CEO*).
+                    The same ordering is reflected in the entropy-based diversity metric, with Stable Diffusion v1.4 measured as the most diverse (2.2)
+                    followed by v2 (1.9), and Dall-E 2 (1.7).
+                    These three systems correspond to different approaches and design choices, especially in terms of filtering of the pre-training data for Stable Diffusion v2 and Dall-E 2,
+                    guided by concerns of performance (esthetic value of the images, relevance of the image to the prompt) and risks of generating unwanted content
+                    (usually with a focus on sexual themes and violence). While these two aspects are undoubtedly important,
+                    the results we present here suggest that much more care needs to be taken in ensuring that these interventions do not exacerbate social dynamics that further erase marginalized populations.
                     """
                 )
             with gr.Column(scale=1):
                     choices=[
                         "Results table: all models",
                         "Results table: Stable Diffusion v1.4",
+                        "Results table: Stable Diffusion v2.",
                         "Results table: Stable Diffusion Dall-E 2",
                         "Comparison histogram: all professions",
+                        "CEO examplars: Cluster 5",
+                        "CEO examplars: Cluster 6",
                     ],
                     value="Results table: all models",
                     show_label=False,
                 )
                 bias_screenshot_1 = gr.Image(
+                    value=os.path.join(impath_bias, "cluster_assign_24_all.png"),
                     label="Screenshot of the Profession Bias Tool | Results table: all models",
                 )
         with gr.Row():
                     show_label=False,
                 )
                 bias_screenshot_2 = gr.Image(
+                    value=os.path.join(impath_bias, "cluster_assign_mental_health_24_all.png"),
                     label="Screenshot of the Profession Bias Tool | Results table: mental health professions, all models",
                 )
+                mental_health_examlpars = gr.Gallery(
                     [
+                        (Image.open(os.path.join(impath_bias, im)), name)
                         for im, name in [
+                            ("psychologist_2_of_24.png", "2 - psychologists"),
+                            ("psychologist_5_of_24.png", "5 - psychologists"),
+                            ("psychologist_15_of_24.png", "15 - psychologists"),
+                            ("psychologist_19_of_24.png", "19 - psychologists"),
+                            ("psychologist_0_of_24.png", "0 - psychologists"),
+                            ("social_assistant_2_of_24.png", "2 - social assistant"),
+                            ("social_assistant_5_of_24.png", "5 - social assistant"),
+                            ("social_assistant_15_0f_24.png", "15 - social assistant"),
+                            ("social_assistant_19_of_24.png", "19 - social assistant"),
+                            ("social_assistant_0_of_24.png", "0 - social assistant"),
+                            ("social_worker_2_of_24.png", "2 - social worker"),
+                            ("social_worker_5_of_24.png", "5 - social worker"),
+                            ("social_worker_15_of_24.png", "15 - social worker"),
+                            ("social_worker_19_of_24.png", "19 - social worker"),
+                            ("social_worker_0_of_24.png", "0 - social worker"),
                         ]
                     ],
                     label="Example images generated by three text-to-image models (Dall-E 2, Stable Diffusion v1.4 and v.2)",
                     """
                     #### [Focused Comparison: Mental Health Professions](https://hf.co/spaces/society-ethics/DiffusionClustering "you can cycle through screenshots of the tool in use on the left and example images below, or go straight to the interactive demo")
+                    We can also leverage the [Profession Bias Tool](https://hf.co/spaces/society-ethics/DiffusionClustering) for more focused analysis.
+                    For this case study, we use it to compare the distribution over identity clusters for three professions related to mental health and care:
+                    [social assistant](https://www.bls.gov/ooh/community-and-social-service/social-and-human-service-assistants.htm),
+                    [social worker](https://www.bls.gov/ooh/community-and-social-service/social-workers.htm), and [psychologist](https://www.bls.gov/ooh/life-physical-and-social-science/psychologists.htm).
+                    We can see that the BLS reports a significant majority of women for all three professions, from 74.9 to 83.6 percent of the workforce.
+                    However, the BLS also reports different degree requirements and significant differences in median income,
+                    with *social assistants* at **18$/hour**, *social workers* at **24$/hour**, and *psychologists* at **39$/hour**.
+                    Going back to the proportions of clusters 5 and 19, we find that this corresponds to clusters associated with the terms *White* and *man*
+                    accounting for **10.5%**, **22.1%**, and **49.8%** of generated images respectively for the prompts mentioning each of the three professions.
+                    These findings underline the urgency of also **including socioeconomic factors in fairness evaluations.**
+                    The **Profession Focus** tab provides further information to help interpret these results.
+                    First, the comparative histograms (screenshots to the left) show how the trends hold across models.
+                    Generations from all models follow the ordering outlined above, although we do see differences in the distribution for the *social worker* prompts
+                    between the Stable Diffusion v.2. and Dall-E 2 systems.
+                    Second, we show exemplars of image generations for each profession that are assigned to each of the identity clusters (screenshot to the left, below).
+                    This serves the dual purpose of confirming our intuitions about the visual features encoded in the cluster assignments and of making it easier to find examples
+                    of profession image generations that are likely to showcase specific features.
+                    For example, we see in the table that **cluster 0**, which is made up primarily of prompts featuring the phrases *woman* and *African American*,
+                    is better represented for profession images generated for *social worker* prompts - the tool lets us visually inspect the images for this profession assigned to that cluster.
+                    ***Note on the cluster exemplar selection:*** In order to also show the limitations of the cluster assignment, we show the three images that are
+                    closest to the cluster centroid, two close to the median distance, and finally the three images that are assigned to the cluster with the least confidence.
+                    This shows atypical generations often with text printed (or something that looks like text), black and white pictures, pictures with multiple faces, and, in one case, a dog.
+                    Given the stochastic nature of the generation, finding surprising generations across the 90,000+ images we generated is not wholly unexpected.
+                    We provide more details on how we identified outliers across the image generations in the section entitled **Exploring the Pixel Space of Generated Images**.
                     """
                 )
+        demo.load(
+            show_bias_images,
+                inputs=[bias_cl_id_1, bias_cl_id_2],
+                outputs=[
+                    bias_screenshot_1,
+                    bias_screenshot_2,
                 ],
+        )
+        for var in [bias_cl_id_1, bias_cl_id_2]:
+            var.change(
+                show_bias_images,
+                inputs=[bias_cl_id_1, bias_cl_id_2],
+                outputs=[
+                    bias_screenshot_1,
+                    bias_screenshot_2,
+                ],
+            )
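
The accordion text added in this commit describes two steps: assigning each generated image to an identity cluster via a dot product between the image embedding and the cluster centroids, and measuring diversity as the entropy of the resulting distribution over clusters. The sketch below is not part of app.py or of this commit. It only illustrates those two steps under stated assumptions: the function names `assign_cluster` and `cluster_entropy` are hypothetical, embeddings and centroids are plain lists of floats, and the natural logarithm is used because the text does not state a base.

```python
import math
from collections import Counter


def assign_cluster(embedding, centroids):
    """Return the index of the centroid with the largest dot product.

    Hypothetical helper: the commit text only says the assignment is
    "based on a dot product between the image embedding and cluster centroid".
    """
    scores = [
        sum(e * c for e, c in zip(embedding, centroid))
        for centroid in centroids
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def cluster_entropy(assignments):
    """Shannon entropy of the empirical distribution over cluster ids.

    Assumption: natural logarithm; the source does not specify the base.
    """
    counts = Counter(assignments)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


if __name__ == "__main__":
    # Toy data, purely illustrative: two centroids and three "image" embeddings.
    toy_centroids = [[1.0, 0.0], [0.0, 1.0]]
    toy_embeddings = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7]]
    toy_assignments = [assign_cluster(e, toy_centroids) for e in toy_embeddings]
    print(toy_assignments, cluster_entropy(toy_assignments))
```

A profession whose images all land in one cluster gets an entropy of zero, while a profession spread evenly over many clusters gets a higher value, which is the sense in which the text reads a higher entropy as more diverse.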
     gr.Markdown(
         """