topicmodeldiscovery

Topic Modeling as a Tool for Resource Discovery

View the Project on GitHub efkuehn/topicmodeldiscovery

Find the Dominant Document For Each Topic

Once we had a topic model with a reasonably even distribution and bags of words that formed fairly coherent topics, we needed to explore the specific topics in the corpus. To do this, we created a pandas DataFrame, which works much like a spreadsheet but gives us the full functionality of Python on top of it.

The first two cells import the necessary modules and load the data.

import pandas as pd
import json
from gensim import corpora 
from gensim.models.ldamodel import LdaModel 
from gensim.corpora.dictionary import Dictionary
lda_model = LdaModel.load('./models/PrelimTOpicModel2') 
corpus_dict = Dictionary.load_from_text('./models/corpus_dictionary_2')
with open('./models/corpus.json', 'r') as fp:
    corpus = json.load(fp)
with open('./models/text_list.json', 'r') as fp:
    text_list = json.load(fp)
with open('./models/corpus_list.json', 'r') as fp:
    corpus_list = json.load(fp)

The following code is the primary function that creates the dataframe. The dataframe has a row for each page of each document, showing which topic is dominant for the words on that page and what the distinctive words are for that topic. It also includes the PDF and page number of the page being analyzed.

This allowed us to go back and look at the page for further context, in order to better understand the topics.

# Build a pandas DataFrame with one row per document (here, one per page),
# recording its dominant topic, that topic's share, and the topic's keywords.
def format_topics_sent(ldamodel, corpus, texts):
    rows = []
    for doc in ldamodel[corpus]:
        # With per_word_topics=True the model yields a 3-tuple per document;
        # element 0 is the list of (topic_num, proportion) pairs.
        topic_num, prop_topic = max(doc[0], key=lambda x: x[1])
        wp = ldamodel.show_topic(topic_num)
        topic_keywords = ", ".join(word for word, prop in wp)
        rows.append([int(topic_num), round(float(prop_topic), 4), topic_keywords])
    # DataFrame.append was removed in pandas 2.0, so collect the rows first.
    sent_topics_df = pd.DataFrame(rows, columns=['Dominant_topic', 'Perc_Contrib', 'Topic_Keywords'])
    contents = pd.Series(texts)
    sent_topics_df = pd.concat([sent_topics_df, contents], axis=1)
    sent_topics_df.rename(columns={0: "Text"}, inplace=True)
    return sent_topics_df
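
The core step inside this function, picking the highest-proportion topic from one document's distribution, can be sketched without gensim or pandas. The helper name `dominant_topic` and the sample distribution below are illustrative assumptions, not part of the project code.

```python
# Minimal sketch: pick the dominant topic from one document's topic distribution.
# `dominant_topic` and `sample_dist` are hypothetical, for illustration only.

def dominant_topic(topic_dist):
    """Return the (topic_num, proportion) pair with the largest proportion.

    Ties resolve to the first pair in the list, which matches what a
    stable descending sort followed by taking the first element would do.
    """
    return max(topic_dist, key=lambda pair: pair[1])


# A made-up distribution over four topics
sample_dist = [(0, 0.05), (1, 0.15), (2, 0.70), (3, 0.10)]
topic_num, proportion = dominant_topic(sample_dist)
print(topic_num, proportion)
```

Because ties go to the first entry, a document whose topics all have the same proportion would be reported as topic 0.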

Exploring the Dominant Topic Models

To better understand the specifics of this code, we can inspect individual rows by creating a generator that yields them one at a time.

def format_topics_sent_gen(ldamodel, corpus, texts):
    for i, row in enumerate(ldamodel[corpus]):
        yield row
row_generator = format_topics_sent_gen(lda_model, corpus, corpus_list)
row = next(row_generator)
row
([(0, 0.010000001),
  (1, 0.010000001),
  (2, 0.010000001),
  (3, 0.010000001),
  (4, 0.010000001),
  (5, 0.010000001),
  (6, 0.010000001),
  (7, 0.010000001),
  (8, 0.010000001),
  (9, 0.76),
  (10, 0.010000001),
  (11, 0.010000001),
  (12, 0.010000001),
  (13, 0.010000001),
  (14, 0.010000001),
  (15, 0.010000001),
  (16, 0.010000001),
  (17, 0.010000001),
  (18, 0.010000001),
  (19, 0.010000001),
  (20, 0.010000001),
  (21, 0.010000001),
  (22, 0.010000001),
  (23, 0.010000001),
  (24, 0.010000001)],
 [(0, [9])],
 [(0, [(9, 2.9999995)])])
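
The row above is a 3-tuple rather than a plain list. Assuming the model was trained with `per_word_topics=True` (which the three-part shape suggests), gensim describes the parts as the document's topic distribution, the most probable topics for each word id, and per-word phi values scaled by word count. The sketch below unpacks a shortened, hand-copied version of that row; the variable names are our own.

```python
# Shortened copy of the row shown above: only a few of the 25 topic pairs are kept.
row = (
    [(0, 0.010000001), (8, 0.010000001), (9, 0.76), (10, 0.010000001)],
    [(0, [9])],
    [(0, [(9, 2.9999995)])],
)

# Names are our own; the order follows the tuple printed above.
doc_topics, word_topics, word_phis = row

# Part 1: topic distribution for the whole document
best_topic, best_share = max(doc_topics, key=lambda pair: pair[1])
print(f"dominant topic: {best_topic} ({best_share:.2f})")

# Part 2: for each word id, the topics it is most likely to belong to
for word_id, topic_ids in word_topics:
    print(f"word id {word_id} -> most likely topics {topic_ids}")

# Part 3: for each word id, a phi value per topic
for word_id, phis in word_phis:
    for topic_id, phi in phis:
        print(f"word id {word_id}, topic {topic_id}: phi = {phi:.2f}")
```

This is why `format_topics_sent` indexes `row[0]`: only the first part holds the per-document topic proportions.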

To look at the details of a specific topic and its word distribution, you can query the lda_model directly. The topn parameter sets how many words to display.

lda_model.show_topic(21, topn=30)
[('god', 0.20524834),
 ('people', 0.08997392),
 ('power', 0.0812579),
 ('faith', 0.057230312),
 ('christian', 0.05547319),
 ('life', 0.05419738),
 ('word', 0.049899396),
 ('world', 0.045320693),
 ('way', 0.031713385),
 ('human', 0.030553361),
 ('reality', 0.025296446),
 ('experience', 0.023166098),
 ('doe', 0.021962931),
 ('tulud', 0.019804804),
 ('need', 0.016158376),
 ('especially', 0.013886022),
 ('like', 0.01281783),
 ('sense', 0.012462943),
 ('particularly', 0.011948879),
 ('fact', 0.011376779),
 ('just', 0.01132918),
 ('make', 0.011165283),
 ('time', 0.0108135445),
 ('g', 0.010585245),
 ('relation', 0.009754408),
 ('good', 0.00956607),
 ('example', 0.009151671),
 ('culture', 0.008596516),
 ('context', 0.008572249),
 ('challenge', 0.007773363)]
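
As a rough illustration of how this output relates to the Topic_Keywords column, the sketch below takes the first ten (word, weight) pairs from the list above, joins the words the same way `format_topics_sent` does, and sums their weights. Ten is used because `show_topic` returns ten words when `topn` is not given; the variable names are our own.

```python
# First ten (word, weight) pairs copied from the show_topic(21, topn=30) output above
top_words = [
    ('god', 0.20524834),
    ('people', 0.08997392),
    ('power', 0.0812579),
    ('faith', 0.057230312),
    ('christian', 0.05547319),
    ('life', 0.05419738),
    ('word', 0.049899396),
    ('world', 0.045320693),
    ('way', 0.031713385),
    ('human', 0.030553361),
]

# Same join used to build the Topic_Keywords column
keywords = ", ".join(word for word, weight in top_words)

# Share of the topic's probability mass carried by these ten words
covered = sum(weight for _, weight in top_words)

print(keywords)
print(f"top-10 words cover about {covered:.3f} of the topic")
```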
sent_topics_df = format_topics_sent(lda_model, corpus, text_list)
sent_topics_df
Dominant_topic Perc_Contrib Topic_Keywords Text
0 9.0 0.7600 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Davidson 2018.pdf, 0]
1 9.0 0.6080 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Davidson 2018.pdf, 1]
2 9.0 0.6800 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Davidson 2018.pdf, 2]
3 21.0 0.5200 god, people, power, faith, christian, life, wo... [../pdfs/Davidson 2018.pdf, 3]
4 9.0 0.6040 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Davidson 2018.pdf, 4]
5 9.0 0.8629 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Davidson 2018.pdf, 5]
6 9.0 0.8080 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Davidson 2018.pdf, 6]
7 14.0 0.5200 right, human, word, reality, state, world, tim... [../pdfs/Davidson 2018.pdf, 7]
8 0.0 0.0400 black, experience, life, mean, like, make, poi... [../pdfs/Davidson 2018.pdf, 8]
9 9.0 0.6800 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Davidson 2018.pdf, 9]
10 9.0 0.8400 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Davidson 2018.pdf, 10]
11 9.0 0.7600 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Davidson 2018.pdf, 11]
12 0.0 0.0400 black, experience, life, mean, like, make, poi... [../pdfs/Davidson 2018.pdf, 12]
13 17.0 0.5200 research, form, study, mean, need, way, order,... [../pdfs/Davidson 2018.pdf, 13]
14 9.0 0.5200 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Davidson 2018.pdf, 14]
15 5.0 0.5200 social, political, economic, immigrant, societ... [../pdfs/Davidson 2018.pdf, 15]
16 13.0 0.4080 new, press, ed, york, power, study, global, pe... [../pdfs/Davidson 2018.pdf, 16]
17 14.0 0.4080 right, human, word, reality, state, world, tim... [../pdfs/Davidson 2018.pdf, 17]
18 0.0 0.0400 black, experience, life, mean, like, make, poi... [../pdfs/Davidson 2018.pdf, 18]
19 7.0 0.3448 theology, experience, theological, tulud, cont... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
20 8.0 0.2205 group, community, religious, social, role, tim... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
21 20.0 0.4337 struggle, woman, life, oppression, feminist, e... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
22 18.0 0.6215 woman, feminist, oppression, tulud, particular... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
23 20.0 0.4992 struggle, woman, life, oppression, feminist, e... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
24 24.0 0.2211 filipino, philippine, hk, migrant, tulud, just... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
25 20.0 0.4348 struggle, woman, life, oppression, feminist, e... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
26 20.0 0.4400 struggle, woman, life, oppression, feminist, e... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
27 20.0 0.2912 struggle, woman, life, oppression, feminist, e... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
28 20.0 0.5509 struggle, woman, life, oppression, feminist, e... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
29 3.0 0.2221 migrant, country, home, community, family, exp... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
... ... ... ... ...
1089 7.0 0.5100 theology, experience, theological, tulud, cont... [../pdfs/Ahn 2018.pdf, 16]
1090 5.0 0.4263 social, political, economic, immigrant, societ... [../pdfs/Ahn 2018.pdf, 17]
1091 0.0 0.0400 black, experience, life, mean, like, make, poi... [../pdfs/Ahn 2018.pdf, 18]
1092 9.0 0.2080 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Haug 2018.pdf, 0]
1093 13.0 0.6800 new, press, ed, york, power, study, global, pe... [../pdfs/Haug 2018.pdf, 1]
1094 9.0 0.3467 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Haug 2018.pdf, 2]
1095 9.0 0.3467 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Haug 2018.pdf, 3]
1096 4.0 0.5095 migration, context, study, challenge, communit... [../pdfs/Haug 2018.pdf, 4]
1097 11.0 0.5200 religion, religious, culture, cultural, christ... [../pdfs/Haug 2018.pdf, 5]
1098 7.0 0.5200 theology, experience, theological, tulud, cont... [../pdfs/Haug 2018.pdf, 6]
1099 9.0 0.3467 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Haug 2018.pdf, 7]
1100 9.0 0.5771 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Haug 2018.pdf, 8]
1101 0.0 0.0400 black, experience, life, mean, like, make, poi... [../pdfs/Haug 2018.pdf, 9]
1102 4.0 0.5200 migration, context, study, challenge, communit... [../pdfs/Haug 2018.pdf, 10]
1103 0.0 0.0400 black, experience, life, mean, like, make, poi... [../pdfs/Haug 2018.pdf, 11]
1104 9.0 0.3467 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Haug 2018.pdf, 12]
1105 12.0 0.5200 work, place, family, home, like, case, make, m... [../pdfs/Haug 2018.pdf, 13]
1106 21.0 0.5100 god, people, power, faith, christian, life, wo... [../pdfs/Haug 2018.pdf, 14]
1107 0.0 0.0400 black, experience, life, mean, like, make, poi... [../pdfs/Haug 2018.pdf, 15]
1108 4.0 0.3467 migration, context, study, challenge, communit... [../pdfs/Cruz - 2010 - Preliminary Material.pd...
1109 7.0 0.5100 theology, experience, theological, tulud, cont... [../pdfs/Cruz - 2010 - Preliminary Material.pd...
1110 23.0 0.6080 hong, kong, tulud, filipina, filipino, g, huma... [../pdfs/Cruz - 2010 - Preliminary Material.pd...
1111 17.0 0.3276 research, form, study, mean, need, way, order,... [../pdfs/Cruz - 2010 - Preliminary Material.pd...
1112 23.0 0.5200 hong, kong, tulud, filipina, filipino, g, huma... [../pdfs/Cruz - 2010 - Preliminary Material.pd...
1113 23.0 0.3738 hong, kong, tulud, filipina, filipino, g, huma... [../pdfs/Cruz - 2010 - Preliminary Material.pd...
1114 5.0 0.2232 social, political, economic, immigrant, societ... [../pdfs/Cruz - 2010 - Preliminary Material.pd...
1115 20.0 0.3811 struggle, woman, life, oppression, feminist, e... [../pdfs/Cruz - 2010 - Preliminary Material.pd...
1116 20.0 0.5257 struggle, woman, life, oppression, feminist, e... [../pdfs/Cruz - 2010 - Preliminary Material.pd...
1117 2.0 0.4688 worker, domestic, migrant, filipina, condition... [../pdfs/Cruz - 2010 - Preliminary Material.pd...
1118 24.0 0.2966 filipino, philippine, hk, migrant, tulud, just... [../pdfs/Cruz - 2010 - Preliminary Material.pd...

1119 rows × 4 columns
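
Several rows above list topic 0 with a contribution of 0.0400. With 25 topics, 0.04 is exactly 1/25, which is what a uniform distribution would give, so one plausible reading (an assumption, not something we verified against the PDFs) is that those pages contributed no usable tokens and topic 0 was reported only because it comes first. A minimal sketch for flagging such rows, using a small hand-made frame in place of `sent_topics_df`:

```python
import pandas as pd

NUM_TOPICS = 25  # assumed from the topic ids 0-24 seen in the output

# Tiny stand-in for sent_topics_df; values copied from rows 6-9 above
demo_df = pd.DataFrame({
    'Dominant_topic': [9, 14, 0, 9],
    'Perc_Contrib': [0.8080, 0.5200, 0.0400, 0.6800],
    'Text': [
        ['../pdfs/Davidson 2018.pdf', 6],
        ['../pdfs/Davidson 2018.pdf', 7],
        ['../pdfs/Davidson 2018.pdf', 8],
        ['../pdfs/Davidson 2018.pdf', 9],
    ],
})

# A dominant share equal to 1/NUM_TOPICS means no topic stood out at all
uniform_share = round(1 / NUM_TOPICS, 4)
is_uniform = (demo_df['Perc_Contrib'] - uniform_share).abs() < 1e-4

uninformative_df = demo_df[is_uniform]
informative_df = demo_df[~is_uniform]

print(uninformative_df)
print(len(informative_df), "rows kept")
```

Setting such rows aside before ranking pages keeps topic 0 from being credited with pages that carry no signal.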

The following code was used, and reused, to show the details of a specific topic. This allowed us to see the parallels between the different documents.

sent_topics_df[sent_topics_df['Dominant_topic'] == 21.0].sort_values('Perc_Contrib', ascending=False)
Dominant_topic Perc_Contrib Topic_Keywords Text
647 21.0 0.7600 god, people, power, faith, christian, life, wo... [../pdfs/Thompson 2017.pdf, 3]
1047 21.0 0.7127 god, people, power, faith, christian, life, wo... [../pdfs/Izuzquiza - 2011 - Breaking bread not...
992 21.0 0.7060 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Five. A Differe...
1054 21.0 0.6937 god, people, power, faith, christian, life, wo... [../pdfs/Izuzquiza - 2011 - Breaking bread not...
548 21.0 0.6903 god, people, power, faith, christian, life, wo... [../pdfs/Nnamani 2015.pdf, 3]
520 21.0 0.6800 god, people, power, faith, christian, life, wo... [../pdfs/Strine 2018.pdf, 1]
810 21.0 0.6260 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Three. Expandin...
1053 21.0 0.6158 god, people, power, faith, christian, life, wo... [../pdfs/Izuzquiza - 2011 - Breaking bread not...
153 21.0 0.6090 god, people, power, faith, christian, life, wo... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
546 21.0 0.6017 god, people, power, faith, christian, life, wo... [../pdfs/Nnamani 2015.pdf, 1]
791 21.0 0.5950 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Three. Expandin...
1052 21.0 0.5829 god, people, power, faith, christian, life, wo... [../pdfs/Izuzquiza - 2011 - Breaking bread not...
646 21.0 0.5767 god, people, power, faith, christian, life, wo... [../pdfs/Thompson 2017.pdf, 2]
615 21.0 0.5324 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Four. Exploring...
618 21.0 0.5295 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Four. Exploring...
602 21.0 0.5227 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Four. Exploring...
807 21.0 0.5206 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Three. Expandin...
3 21.0 0.5200 god, people, power, faith, christian, life, wo... [../pdfs/Davidson 2018.pdf, 3]
1023 21.0 0.5200 god, people, power, faith, christian, life, wo... [../pdfs/Rowlands 2018.pdf, 3]
1087 21.0 0.5200 god, people, power, faith, christian, life, wo... [../pdfs/Ahn 2018.pdf, 14]
530 21.0 0.5200 god, people, power, faith, christian, life, wo... [../pdfs/Strine 2018.pdf, 11]
1106 21.0 0.5100 god, people, power, faith, christian, life, wo... [../pdfs/Haug 2018.pdf, 14]
251 21.0 0.4899 god, people, power, faith, christian, life, wo... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
625 21.0 0.4835 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Four. Exploring...
772 21.0 0.4811 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Three. Expandin...
215 21.0 0.4675 god, people, power, faith, christian, life, wo... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
610 21.0 0.4641 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Four. Exploring...
616 21.0 0.4612 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Four. Exploring...
773 21.0 0.4561 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Three. Expandin...
1081 21.0 0.4557 god, people, power, faith, christian, life, wo... [../pdfs/Ahn 2018.pdf, 8]
... ... ... ... ...
1061 21.0 0.3035 god, people, power, faith, christian, life, wo... [../pdfs/Izuzquiza - 2011 - Breaking bread not...
244 21.0 0.3016 god, people, power, faith, christian, life, wo... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
981 21.0 0.2992 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Five. A Differe...
228 21.0 0.2988 god, people, power, faith, christian, life, wo... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
1050 21.0 0.2878 god, people, power, faith, christian, life, wo... [../pdfs/Izuzquiza - 2011 - Breaking bread not...
770 21.0 0.2849 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Three. Expandin...
281 21.0 0.2819 god, people, power, faith, christian, life, wo... [../pdfs/Jimenez 2019.pdf, 5]
991 21.0 0.2799 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Five. A Differe...
463 21.0 0.2774 god, people, power, faith, christian, life, wo... [../pdfs/cruz2010.pdf, 31]
50 21.0 0.2774 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
155 21.0 0.2710 god, people, power, faith, christian, life, wo... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
572 21.0 0.2698 god, people, power, faith, christian, life, wo... [../pdfs/Soares et al 2017.pdf, 4]
74 21.0 0.2673 god, people, power, faith, christian, life, wo... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
212 21.0 0.2673 god, people, power, faith, christian, life, wo... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
968 21.0 0.2542 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Five. A Differe...
233 21.0 0.2530 god, people, power, faith, christian, life, wo... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
573 21.0 0.2505 god, people, power, faith, christian, life, wo... [../pdfs/Soares et al 2017.pdf, 5]
214 21.0 0.2483 god, people, power, faith, christian, life, wo... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
620 21.0 0.2476 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Four. Exploring...
511 21.0 0.2406 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Two. Frontiers ...
806 21.0 0.2403 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Three. Expandin...
578 21.0 0.2400 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Four. Exploring...
292 21.0 0.2394 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Introduction.pdf, 0]
621 21.0 0.2291 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Four. Exploring...
568 21.0 0.2285 god, people, power, faith, christian, life, wo... [../pdfs/Soares et al 2017.pdf, 0]
970 21.0 0.2233 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Five. A Differe...
35 21.0 0.2217 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
448 21.0 0.2217 god, people, power, faith, christian, life, wo... [../pdfs/cruz2010.pdf, 16]
340 21.0 0.2045 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter One. Geographie...
498 21.0 0.1989 god, people, power, faith, christian, life, wo... [../pdfs/Cruz - 2010 - Chapter Two. Frontiers ...

109 rows × 4 columns
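
Since this filter was rerun for different topics, it can be wrapped in a small helper. The function name `pages_for_topic` and the `min_share` cutoff are hypothetical additions for illustration, and the demo frame stands in for `sent_topics_df` with a few values copied from the output above.

```python
import pandas as pd


def pages_for_topic(df, topic_num, min_share=0.0):
    """Return the rows whose dominant topic is `topic_num`, strongest first.

    `min_share` optionally drops pages where the topic's share is weak.
    """
    mask = (df['Dominant_topic'] == topic_num) & (df['Perc_Contrib'] >= min_share)
    return df[mask].sort_values('Perc_Contrib', ascending=False)


# Small stand-in for sent_topics_df
demo_df = pd.DataFrame({
    'Dominant_topic': [21, 9, 21, 21],
    'Perc_Contrib': [0.5200, 0.7600, 0.7600, 0.2394],
    'Text': [
        ['../pdfs/Davidson 2018.pdf', 3],
        ['../pdfs/Davidson 2018.pdf', 0],
        ['../pdfs/Thompson 2017.pdf', 3],
        ['../pdfs/Cruz - 2010 - Introduction.pdf', 0],
    ],
})

all_pages = pages_for_topic(demo_df, 21)
strong_pages = pages_for_topic(demo_df, 21, min_share=0.5)
print(strong_pages)
```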

Exploring each topic was helpful, but we also wanted a shorter dataframe listing each topic alongside the document that best exemplified it. The first cell below groups the dataframe by dominant topic, and the second builds a new dataframe that keeps only the best-exemplifying page for each topic.

grpd_df = sent_topics_df.groupby('Dominant_topic')
# Build a pandas DataFrame holding, for each topic, the page that best exemplifies it
new_df = pd.DataFrame()

for i, grp in grpd_df:
    new_df = pd.concat([new_df, grp.sort_values(['Perc_Contrib'], ascending=False).head(1)], axis=0)

new_df.reset_index(drop=True, inplace=True)
new_df.columns = ['Topic_Num', 'Topic_Perc_Contrib', 'Keywords', 'Text']
new_df
Topic_Num Topic_Perc_Contrib Keywords Text
0 0.0 0.5200 black, experience, life, mean, like, make, poi... [../pdfs/Rowlands 2018.pdf, 16]
1 1.0 0.5854 identity, challenge, term, experience, context... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
2 2.0 0.4688 worker, domestic, migrant, filipina, condition... [../pdfs/Cruz - 2010 - Preliminary Material.pd...
3 3.0 0.5200 migrant, country, home, community, family, exp... [../pdfs/Snyder 2018.pdf, 16]
4 4.0 0.5200 migration, context, study, challenge, communit... [../pdfs/Snyder 2018.pdf, 5]
5 5.0 0.7828 social, political, economic, immigrant, societ... [../pdfs/Jimenez 2019.pdf, 7]
6 6.0 0.6744 church, christian, american, immigrant, commun... [../pdfs/Nnamani 2015.pdf, 5]
7 7.0 0.6938 theology, experience, theological, tulud, cont... [../pdfs/cruz2010.pdf, 34]
8 8.0 0.4646 group, community, religious, social, role, tim... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
9 9.0 0.9751 œ, dorottya, martha, human, order, case, g, st... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
10 10.0 0.6800 say, way, make, problem, time, life, mean, peo... [../pdfs/Strine 2018.pdf, 8]
11 11.0 0.5710 religion, religious, culture, cultural, christ... [../pdfs/Nnamani 2015.pdf, 14]
12 12.0 0.6930 work, place, family, home, like, case, make, m... [../pdfs/Cruz - 2010 - Chapter Two. Frontiers ...
13 13.0 0.7600 new, press, ed, york, power, study, global, pe... [../pdfs/Snyder 2018.pdf, 17]
14 14.0 0.5200 right, human, word, reality, state, world, tim... [../pdfs/Davidson 2018.pdf, 7]
15 15.0 0.5200 service, relationship, sense, work, good, espe... [../pdfs/Rowlands 2018.pdf, 0]
16 16.0 0.5200 theological, book, human, faith, case, g, life... [../pdfs/Thompson 2017.pdf, 0]
17 17.0 0.5653 research, form, study, mean, need, way, order,... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
18 18.0 0.6215 woman, feminist, oppression, tulud, particular... [../pdfs/Cruz - 2010 - Chapter Six. Expanding ...
19 19.0 0.6800 mission, world, international, dorottya, marth... [../pdfs/Frederiks and Nagy - 2016 - Religion,...
20 20.0 0.6381 struggle, woman, life, oppression, feminist, e... [../pdfs/cruz2010.pdf, 27]
21 21.0 0.7600 god, people, power, faith, christian, life, wo... [../pdfs/Thompson 2017.pdf, 3]
22 23.0 0.7664 hong, kong, tulud, filipina, filipino, g, huma... [../pdfs/Cruz - 2010 - An intercultural theolo...
23 24.0 0.5721 filipino, philippine, hk, migrant, tulud, just... [../pdfs/Cruz - 2010 - Chapter One. Geographie...
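
The table above has 24 rows although the topic ids run from 0 to 24: topic 22 does not appear, presumably because no page had it as its dominant topic. The sketch below shows one alternative way to pick the best page per topic with `groupby(...).idxmax()` and then list the topic ids that never came out as dominant. The demo data and the topic count of 4 are made up for illustration.

```python
import pandas as pd

# Made-up stand-in for sent_topics_df
demo_df = pd.DataFrame({
    'Dominant_topic': [0, 0, 2, 2, 3],
    'Perc_Contrib': [0.0400, 0.5200, 0.4688, 0.3000, 0.5200],
    'Text': ['page a', 'page b', 'page c', 'page d', 'page e'],
})

# idxmax gives, per topic, the index label of the row with the largest share
best_idx = demo_df.groupby('Dominant_topic')['Perc_Contrib'].idxmax()
best_rows = demo_df.loc[best_idx].reset_index(drop=True)

# Topic ids that were never the dominant topic of any page
num_topics = 4  # hypothetical topic count for this demo
seen = set(best_rows['Dominant_topic'].tolist())
missing = sorted(set(range(num_topics)) - seen)

print(best_rows)
print("topics with no dominant page:", missing)
```

A topic missing from this list is not necessarily empty; it may simply always be outweighed by another topic on every page.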

Details of the Topic Model

One of the problems with topic modeling is that, because it is an unsupervised clustering method, the computer sometimes sees connections that are not obvious or, at the very least, are not semantic clusters. Topic modeling is a blunt tool, but we picked six of these topics that we thought might be helpful in discovering books from the past 100 years that build on the topic we had chosen.

These topics are:

These topics were analysed in the context of the PDFs that generated them. These were the topics that we thought were both coherent and likely to provide interesting analysis when looked for in the political theology corpus generated from HathiTrust.

These six topics are the only ones we looked for in the HathiTrust corpus.