Topic Modeling as a Tool for Resource Discovery
Once we had a topic model with a reasonably even distribution and bags of words that formed fairly coherent topics, we needed to explore the specific topics in the corpus. To do this, we created a pandas DataFrame, which works much like a spreadsheet but keeps the full functionality of Python available on top of it.
The first two cells import the necessary modules and load the data.
import pandas as pd
import json
from gensim import corpora 
from gensim.models.ldamodel import LdaModel 
from gensim.corpora.dictionary import Dictionary
lda_model = LdaModel.load('./models/PrelimTOpicModel2') 
corpus_dict = Dictionary.load_from_text('./models/corpus_dictionary_2')
with open('./models/corpus.json', 'r') as fp:
    corpus = json.load(fp)
with open('./models/text_list.json', 'r') as fp:
    text_list = json.load(fp)
with open('./models/corpus_list.json', 'r') as fp:
    corpus_list = json.load(fp)
The following code is the primary function that creates the dataframe. The dataframe has a row for each page in the corpus, showing which topic is dominant for the words on that page and what the distinctive words are for that topic. It also includes the PDF and page number for the page we are analyzing.
This allowed us to go back and look at the page for further context, in order to better understand the topics.
# Build a pandas DataFrame with one row per page, showing that page's dominant topic
def format_topics_sent(ldamodel, corpus, texts):
    rows = []
    for row in ldamodel[corpus]:
        # row[0] holds the (topic, proportion) pairs for the page; keep the largest
        topic_num, prop_topic = max(row[0], key=lambda x: x[1])
        # Join the topic's top words into a single keyword string
        wp = ldamodel.show_topic(topic_num)
        topic_keywords = ", ".join([word for word, prop in wp])
        rows.append([int(topic_num), round(prop_topic, 4), topic_keywords])
    # Collect rows in a list and build the frame once (DataFrame.append is gone in pandas 2.0)
    sent_topics_df = pd.DataFrame(rows, columns=['Dominant_topic', 'Perc_Contrib', 'Topic_Keywords'])
    # Add the [pdf, page] reference for each page as the final column
    contents = pd.Series(texts)
    sent_topics_df = pd.concat([sent_topics_df, contents], axis=1)
    sent_topics_df.rename(columns={0: "Text"}, inplace=True)
    return sent_topics_df
To better understand the specifics of this code, we can look at an individual row by creating a generator that yields the rows one at a time.
def format_topics_sent_gen(ldamodel, corpus, texts):
    for i, row in enumerate(ldamodel[corpus]):
        yield row
row_generator = format_topics_sent_gen(lda_model, corpus, corpus_list)
row = next(row_generator)
row
([(0, 0.010000001),
  (1, 0.010000001),
  (2, 0.010000001),
  (3, 0.010000001),
  (4, 0.010000001),
  (5, 0.010000001),
  (6, 0.010000001),
  (7, 0.010000001),
  (8, 0.010000001),
  (9, 0.76),
  (10, 0.010000001),
  (11, 0.010000001),
  (12, 0.010000001),
  (13, 0.010000001),
  (14, 0.010000001),
  (15, 0.010000001),
  (16, 0.010000001),
  (17, 0.010000001),
  (18, 0.010000001),
  (19, 0.010000001),
  (20, 0.010000001),
  (21, 0.010000001),
  (22, 0.010000001),
  (23, 0.010000001),
  (24, 0.010000001)],
 [(0, [9])],
 [(0, [(9, 2.9999995)])])
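The selection step inside format_topics_sent can be tried on a single row in isolation. What follows is a minimal sketch, not the notebook's own code: sample_row is a shortened, hand-typed stand-in for the output above, and dominant_topic is a hypothetical helper name. We read the second and third elements of the tuple as per-word topic information, which the dataframe function does not use.

```python
# Shortened stand-in for one result of lda_model[corpus]; values are illustrative.
sample_row = (
    [(0, 0.01), (8, 0.01), (9, 0.76), (10, 0.01)],  # (topic id, share of the page)
    [(0, [9])],                                      # per-word topic info (unused here)
    [(0, [(9, 2.9999995)])],                         # per-word topic weights (unused here)
)

def dominant_topic(row):
    """Return the (topic id, proportion) pair with the largest proportion."""
    topic_distribution = row[0]
    return max(topic_distribution, key=lambda pair: pair[1])

topic_num, prop_topic = dominant_topic(sample_row)
print(topic_num, round(prop_topic, 4))
```

When several topics tie, max returns the first one it meets, which is why pages with a flat distribution end up assigned to topic 0.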
To look at the details of a specific topic and its word distribution, you can query lda_model directly. The topn parameter sets how many words to display.
lda_model.show_topic(21, topn=30)
[('god', 0.20524834),
 ('people', 0.08997392),
 ('power', 0.0812579),
 ('faith', 0.057230312),
 ('christian', 0.05547319),
 ('life', 0.05419738),
 ('word', 0.049899396),
 ('world', 0.045320693),
 ('way', 0.031713385),
 ('human', 0.030553361),
 ('reality', 0.025296446),
 ('experience', 0.023166098),
 ('doe', 0.021962931),
 ('tulud', 0.019804804),
 ('need', 0.016158376),
 ('especially', 0.013886022),
 ('like', 0.01281783),
 ('sense', 0.012462943),
 ('particularly', 0.011948879),
 ('fact', 0.011376779),
 ('just', 0.01132918),
 ('make', 0.011165283),
 ('time', 0.0108135445),
 ('g', 0.010585245),
 ('relation', 0.009754408),
 ('good', 0.00956607),
 ('example', 0.009151671),
 ('culture', 0.008596516),
 ('context', 0.008572249),
 ('challenge', 0.007773363)]
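The Topic_Keywords column in the dataframe is built from exactly this kind of output by joining the words and dropping the probabilities. The sketch below illustrates that step on the first few pairs copied from the topic 21 output; keywords_string is a hypothetical helper, and its topn argument only mimics the slicing that show_topic performs.

```python
# First few (word, probability) pairs copied from the topic 21 output above.
topic_words = [
    ('god', 0.20524834),
    ('people', 0.08997392),
    ('power', 0.0812579),
    ('faith', 0.057230312),
    ('christian', 0.05547319),
]

def keywords_string(word_probs, topn=3):
    """Join the topn highest-probability words into one comma-separated string."""
    ranked = sorted(word_probs, key=lambda pair: pair[1], reverse=True)
    return ", ".join(word for word, _ in ranked[:topn])

print(keywords_string(topic_words))
```

Because format_topics_sent calls show_topic without topn, gensim's default of ten words applies, so each keyword string in the dataframe is shorter than the thirty-word listing shown here.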
sent_topics_df = format_topics_sent(lda_model, corpus, text_list)
sent_topics_df
| | Dominant_topic | Perc_Contrib | Topic_Keywords | Text |
|---|---|---|---|---|
| 0 | 9.0 | 0.7600 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Davidson 2018.pdf, 0] | 
| 1 | 9.0 | 0.6080 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Davidson 2018.pdf, 1] | 
| 2 | 9.0 | 0.6800 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Davidson 2018.pdf, 2] | 
| 3 | 21.0 | 0.5200 | god, people, power, faith, christian, life, wo... | [../pdfs/Davidson 2018.pdf, 3] | 
| 4 | 9.0 | 0.6040 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Davidson 2018.pdf, 4] | 
| 5 | 9.0 | 0.8629 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Davidson 2018.pdf, 5] | 
| 6 | 9.0 | 0.8080 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Davidson 2018.pdf, 6] | 
| 7 | 14.0 | 0.5200 | right, human, word, reality, state, world, tim... | [../pdfs/Davidson 2018.pdf, 7] | 
| 8 | 0.0 | 0.0400 | black, experience, life, mean, like, make, poi... | [../pdfs/Davidson 2018.pdf, 8] | 
| 9 | 9.0 | 0.6800 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Davidson 2018.pdf, 9] | 
| 10 | 9.0 | 0.8400 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Davidson 2018.pdf, 10] | 
| 11 | 9.0 | 0.7600 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Davidson 2018.pdf, 11] | 
| 12 | 0.0 | 0.0400 | black, experience, life, mean, like, make, poi... | [../pdfs/Davidson 2018.pdf, 12] | 
| 13 | 17.0 | 0.5200 | research, form, study, mean, need, way, order,... | [../pdfs/Davidson 2018.pdf, 13] | 
| 14 | 9.0 | 0.5200 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Davidson 2018.pdf, 14] | 
| 15 | 5.0 | 0.5200 | social, political, economic, immigrant, societ... | [../pdfs/Davidson 2018.pdf, 15] | 
| 16 | 13.0 | 0.4080 | new, press, ed, york, power, study, global, pe... | [../pdfs/Davidson 2018.pdf, 16] | 
| 17 | 14.0 | 0.4080 | right, human, word, reality, state, world, tim... | [../pdfs/Davidson 2018.pdf, 17] | 
| 18 | 0.0 | 0.0400 | black, experience, life, mean, like, make, poi... | [../pdfs/Davidson 2018.pdf, 18] | 
| 19 | 7.0 | 0.3448 | theology, experience, theological, tulud, cont... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 20 | 8.0 | 0.2205 | group, community, religious, social, role, tim... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 21 | 20.0 | 0.4337 | struggle, woman, life, oppression, feminist, e... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 22 | 18.0 | 0.6215 | woman, feminist, oppression, tulud, particular... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 23 | 20.0 | 0.4992 | struggle, woman, life, oppression, feminist, e... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 24 | 24.0 | 0.2211 | filipino, philippine, hk, migrant, tulud, just... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 25 | 20.0 | 0.4348 | struggle, woman, life, oppression, feminist, e... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 26 | 20.0 | 0.4400 | struggle, woman, life, oppression, feminist, e... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 27 | 20.0 | 0.2912 | struggle, woman, life, oppression, feminist, e... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 28 | 20.0 | 0.5509 | struggle, woman, life, oppression, feminist, e... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 29 | 3.0 | 0.2221 | migrant, country, home, community, family, exp... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| ... | ... | ... | ... | ... | 
| 1089 | 7.0 | 0.5100 | theology, experience, theological, tulud, cont... | [../pdfs/Ahn 2018.pdf, 16] | 
| 1090 | 5.0 | 0.4263 | social, political, economic, immigrant, societ... | [../pdfs/Ahn 2018.pdf, 17] | 
| 1091 | 0.0 | 0.0400 | black, experience, life, mean, like, make, poi... | [../pdfs/Ahn 2018.pdf, 18] | 
| 1092 | 9.0 | 0.2080 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Haug 2018.pdf, 0] | 
| 1093 | 13.0 | 0.6800 | new, press, ed, york, power, study, global, pe... | [../pdfs/Haug 2018.pdf, 1] | 
| 1094 | 9.0 | 0.3467 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Haug 2018.pdf, 2] | 
| 1095 | 9.0 | 0.3467 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Haug 2018.pdf, 3] | 
| 1096 | 4.0 | 0.5095 | migration, context, study, challenge, communit... | [../pdfs/Haug 2018.pdf, 4] | 
| 1097 | 11.0 | 0.5200 | religion, religious, culture, cultural, christ... | [../pdfs/Haug 2018.pdf, 5] | 
| 1098 | 7.0 | 0.5200 | theology, experience, theological, tulud, cont... | [../pdfs/Haug 2018.pdf, 6] | 
| 1099 | 9.0 | 0.3467 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Haug 2018.pdf, 7] | 
| 1100 | 9.0 | 0.5771 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Haug 2018.pdf, 8] | 
| 1101 | 0.0 | 0.0400 | black, experience, life, mean, like, make, poi... | [../pdfs/Haug 2018.pdf, 9] | 
| 1102 | 4.0 | 0.5200 | migration, context, study, challenge, communit... | [../pdfs/Haug 2018.pdf, 10] | 
| 1103 | 0.0 | 0.0400 | black, experience, life, mean, like, make, poi... | [../pdfs/Haug 2018.pdf, 11] | 
| 1104 | 9.0 | 0.3467 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Haug 2018.pdf, 12] | 
| 1105 | 12.0 | 0.5200 | work, place, family, home, like, case, make, m... | [../pdfs/Haug 2018.pdf, 13] | 
| 1106 | 21.0 | 0.5100 | god, people, power, faith, christian, life, wo... | [../pdfs/Haug 2018.pdf, 14] | 
| 1107 | 0.0 | 0.0400 | black, experience, life, mean, like, make, poi... | [../pdfs/Haug 2018.pdf, 15] | 
| 1108 | 4.0 | 0.3467 | migration, context, study, challenge, communit... | [../pdfs/Cruz - 2010 - Preliminary Material.pd... | 
| 1109 | 7.0 | 0.5100 | theology, experience, theological, tulud, cont... | [../pdfs/Cruz - 2010 - Preliminary Material.pd... | 
| 1110 | 23.0 | 0.6080 | hong, kong, tulud, filipina, filipino, g, huma... | [../pdfs/Cruz - 2010 - Preliminary Material.pd... | 
| 1111 | 17.0 | 0.3276 | research, form, study, mean, need, way, order,... | [../pdfs/Cruz - 2010 - Preliminary Material.pd... | 
| 1112 | 23.0 | 0.5200 | hong, kong, tulud, filipina, filipino, g, huma... | [../pdfs/Cruz - 2010 - Preliminary Material.pd... | 
| 1113 | 23.0 | 0.3738 | hong, kong, tulud, filipina, filipino, g, huma... | [../pdfs/Cruz - 2010 - Preliminary Material.pd... | 
| 1114 | 5.0 | 0.2232 | social, political, economic, immigrant, societ... | [../pdfs/Cruz - 2010 - Preliminary Material.pd... | 
| 1115 | 20.0 | 0.3811 | struggle, woman, life, oppression, feminist, e... | [../pdfs/Cruz - 2010 - Preliminary Material.pd... | 
| 1116 | 20.0 | 0.5257 | struggle, woman, life, oppression, feminist, e... | [../pdfs/Cruz - 2010 - Preliminary Material.pd... | 
| 1117 | 2.0 | 0.4688 | worker, domestic, migrant, filipina, condition... | [../pdfs/Cruz - 2010 - Preliminary Material.pd... | 
| 1118 | 24.0 | 0.2966 | filipino, philippine, hk, migrant, tulud, just... | [../pdfs/Cruz - 2010 - Preliminary Material.pd... | 
1119 rows × 4 columns
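Since each Text entry is a [pdf, page] pair, it can be convenient to split it into two ordinary columns before going back to the source pages. This is a small sketch on a toy dataframe whose rows are copied from the table above; toy_df and the new column names pdf and page are assumptions for illustration and do not appear in the original notebook.

```python
import pandas as pd

# Toy stand-in for sent_topics_df; the rows are copied from the table above.
toy_df = pd.DataFrame({
    'Dominant_topic': [9.0, 21.0, 14.0],
    'Perc_Contrib': [0.7600, 0.5200, 0.5200],
    'Text': [['../pdfs/Davidson 2018.pdf', 0],
             ['../pdfs/Davidson 2018.pdf', 3],
             ['../pdfs/Davidson 2018.pdf', 7]],
})

# Split each [pdf path, page number] pair into two separate columns.
toy_df['pdf'] = [pdf for pdf, page in toy_df['Text']]
toy_df['page'] = [page for pdf, page in toy_df['Text']]

print(toy_df[['Dominant_topic', 'pdf', 'page']])
```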
The following code was used and reused to show the details of a specific topic. This allowed us to see the parallels between the different documents.
sent_topics_df[sent_topics_df['Dominant_topic'] == 21.0].sort_values('Perc_Contrib', ascending=False)
| | Dominant_topic | Perc_Contrib | Topic_Keywords | Text |
|---|---|---|---|---|
| 647 | 21.0 | 0.7600 | god, people, power, faith, christian, life, wo... | [../pdfs/Thompson 2017.pdf, 3] | 
| 1047 | 21.0 | 0.7127 | god, people, power, faith, christian, life, wo... | [../pdfs/Izuzquiza - 2011 - Breaking bread not... | 
| 992 | 21.0 | 0.7060 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Five. A Differe... | 
| 1054 | 21.0 | 0.6937 | god, people, power, faith, christian, life, wo... | [../pdfs/Izuzquiza - 2011 - Breaking bread not... | 
| 548 | 21.0 | 0.6903 | god, people, power, faith, christian, life, wo... | [../pdfs/Nnamani 2015.pdf, 3] | 
| 520 | 21.0 | 0.6800 | god, people, power, faith, christian, life, wo... | [../pdfs/Strine 2018.pdf, 1] | 
| 810 | 21.0 | 0.6260 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Three. Expandin... | 
| 1053 | 21.0 | 0.6158 | god, people, power, faith, christian, life, wo... | [../pdfs/Izuzquiza - 2011 - Breaking bread not... | 
| 153 | 21.0 | 0.6090 | god, people, power, faith, christian, life, wo... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 546 | 21.0 | 0.6017 | god, people, power, faith, christian, life, wo... | [../pdfs/Nnamani 2015.pdf, 1] | 
| 791 | 21.0 | 0.5950 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Three. Expandin... | 
| 1052 | 21.0 | 0.5829 | god, people, power, faith, christian, life, wo... | [../pdfs/Izuzquiza - 2011 - Breaking bread not... | 
| 646 | 21.0 | 0.5767 | god, people, power, faith, christian, life, wo... | [../pdfs/Thompson 2017.pdf, 2] | 
| 615 | 21.0 | 0.5324 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Four. Exploring... | 
| 618 | 21.0 | 0.5295 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Four. Exploring... | 
| 602 | 21.0 | 0.5227 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Four. Exploring... | 
| 807 | 21.0 | 0.5206 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Three. Expandin... | 
| 3 | 21.0 | 0.5200 | god, people, power, faith, christian, life, wo... | [../pdfs/Davidson 2018.pdf, 3] | 
| 1023 | 21.0 | 0.5200 | god, people, power, faith, christian, life, wo... | [../pdfs/Rowlands 2018.pdf, 3] | 
| 1087 | 21.0 | 0.5200 | god, people, power, faith, christian, life, wo... | [../pdfs/Ahn 2018.pdf, 14] | 
| 530 | 21.0 | 0.5200 | god, people, power, faith, christian, life, wo... | [../pdfs/Strine 2018.pdf, 11] | 
| 1106 | 21.0 | 0.5100 | god, people, power, faith, christian, life, wo... | [../pdfs/Haug 2018.pdf, 14] | 
| 251 | 21.0 | 0.4899 | god, people, power, faith, christian, life, wo... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 625 | 21.0 | 0.4835 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Four. Exploring... | 
| 772 | 21.0 | 0.4811 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Three. Expandin... | 
| 215 | 21.0 | 0.4675 | god, people, power, faith, christian, life, wo... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 610 | 21.0 | 0.4641 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Four. Exploring... | 
| 616 | 21.0 | 0.4612 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Four. Exploring... | 
| 773 | 21.0 | 0.4561 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Three. Expandin... | 
| 1081 | 21.0 | 0.4557 | god, people, power, faith, christian, life, wo... | [../pdfs/Ahn 2018.pdf, 8] | 
| ... | ... | ... | ... | ... | 
| 1061 | 21.0 | 0.3035 | god, people, power, faith, christian, life, wo... | [../pdfs/Izuzquiza - 2011 - Breaking bread not... | 
| 244 | 21.0 | 0.3016 | god, people, power, faith, christian, life, wo... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 981 | 21.0 | 0.2992 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Five. A Differe... | 
| 228 | 21.0 | 0.2988 | god, people, power, faith, christian, life, wo... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 1050 | 21.0 | 0.2878 | god, people, power, faith, christian, life, wo... | [../pdfs/Izuzquiza - 2011 - Breaking bread not... | 
| 770 | 21.0 | 0.2849 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Three. Expandin... | 
| 281 | 21.0 | 0.2819 | god, people, power, faith, christian, life, wo... | [../pdfs/Jimenez 2019.pdf, 5] | 
| 991 | 21.0 | 0.2799 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Five. A Differe... | 
| 463 | 21.0 | 0.2774 | god, people, power, faith, christian, life, wo... | [../pdfs/cruz2010.pdf, 31] | 
| 50 | 21.0 | 0.2774 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 155 | 21.0 | 0.2710 | god, people, power, faith, christian, life, wo... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 572 | 21.0 | 0.2698 | god, people, power, faith, christian, life, wo... | [../pdfs/Soares et al 2017.pdf, 4] | 
| 74 | 21.0 | 0.2673 | god, people, power, faith, christian, life, wo... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 212 | 21.0 | 0.2673 | god, people, power, faith, christian, life, wo... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 968 | 21.0 | 0.2542 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Five. A Differe... | 
| 233 | 21.0 | 0.2530 | god, people, power, faith, christian, life, wo... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 573 | 21.0 | 0.2505 | god, people, power, faith, christian, life, wo... | [../pdfs/Soares et al 2017.pdf, 5] | 
| 214 | 21.0 | 0.2483 | god, people, power, faith, christian, life, wo... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 620 | 21.0 | 0.2476 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Four. Exploring... | 
| 511 | 21.0 | 0.2406 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Two. Frontiers ... | 
| 806 | 21.0 | 0.2403 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Three. Expandin... | 
| 578 | 21.0 | 0.2400 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Four. Exploring... | 
| 292 | 21.0 | 0.2394 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Introduction.pdf, 0] | 
| 621 | 21.0 | 0.2291 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Four. Exploring... | 
| 568 | 21.0 | 0.2285 | god, people, power, faith, christian, life, wo... | [../pdfs/Soares et al 2017.pdf, 0] | 
| 970 | 21.0 | 0.2233 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Five. A Differe... | 
| 35 | 21.0 | 0.2217 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 448 | 21.0 | 0.2217 | god, people, power, faith, christian, life, wo... | [../pdfs/cruz2010.pdf, 16] | 
| 340 | 21.0 | 0.2045 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter One. Geographie... | 
| 498 | 21.0 | 0.1989 | god, people, power, faith, christian, life, wo... | [../pdfs/Cruz - 2010 - Chapter Two. Frontiers ... | 
109 rows × 4 columns
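Because the same filter was rerun for one topic after another, it could also be wrapped in a small function. The following is a sketch under assumptions: pages_for_topic and its min_contrib cutoff are hypothetical additions, and toy_df is a four-row stand-in built from rows shown in the tables above.

```python
import pandas as pd

# Toy stand-in for sent_topics_df, using rows that appear in the tables above.
toy_df = pd.DataFrame({
    'Dominant_topic': [21.0, 9.0, 21.0, 21.0],
    'Perc_Contrib': [0.5200, 0.7600, 0.7600, 0.4557],
    'Text': [['../pdfs/Davidson 2018.pdf', 3],
             ['../pdfs/Davidson 2018.pdf', 0],
             ['../pdfs/Thompson 2017.pdf', 3],
             ['../pdfs/Ahn 2018.pdf', 8]],
})

def pages_for_topic(df, topic_num, min_contrib=0.0):
    """Return the pages dominated by topic_num, strongest first."""
    mask = (df['Dominant_topic'] == topic_num) & (df['Perc_Contrib'] >= min_contrib)
    return df[mask].sort_values('Perc_Contrib', ascending=False)

# Keep only pages where topic 21 accounts for at least half of the page.
strong_pages = pages_for_topic(toy_df, 21, min_contrib=0.5)
print(strong_pages)
```

A cutoff like this is one way to skip the long tail of weakly matching pages when reading through the results.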
Exploring each topic was helpful, but we also wanted a shorter dataframe listing each topic alongside the page that best exemplifies it. The next cell groups the dataframe by dominant topic, and the cell after that builds a new dataframe containing only the best example for each topic.
grpd_df = sent_topics_df.groupby('Dominant_topic')
# Build a pandas DataFrame that shows, for each topic, the page where that topic contributes most
new_df = pd.concat(
    [grp.sort_values('Perc_Contrib', ascending=False).head(1) for _, grp in grpd_df],
    axis=0,
)
new_df.reset_index(drop=True, inplace=True)
new_df.columns = ['Topic_Num', 'Topic_Perc_Contrib', 'Keywords', 'Text']
new_df
| | Topic_Num | Topic_Perc_Contrib | Keywords | Text |
|---|---|---|---|---|
| 0 | 0.0 | 0.5200 | black, experience, life, mean, like, make, poi... | [../pdfs/Rowlands 2018.pdf, 16] | 
| 1 | 1.0 | 0.5854 | identity, challenge, term, experience, context... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 2 | 2.0 | 0.4688 | worker, domestic, migrant, filipina, condition... | [../pdfs/Cruz - 2010 - Preliminary Material.pd... | 
| 3 | 3.0 | 0.5200 | migrant, country, home, community, family, exp... | [../pdfs/Snyder 2018.pdf, 16] | 
| 4 | 4.0 | 0.5200 | migration, context, study, challenge, communit... | [../pdfs/Snyder 2018.pdf, 5] | 
| 5 | 5.0 | 0.7828 | social, political, economic, immigrant, societ... | [../pdfs/Jimenez 2019.pdf, 7] | 
| 6 | 6.0 | 0.6744 | church, christian, american, immigrant, commun... | [../pdfs/Nnamani 2015.pdf, 5] | 
| 7 | 7.0 | 0.6938 | theology, experience, theological, tulud, cont... | [../pdfs/cruz2010.pdf, 34] | 
| 8 | 8.0 | 0.4646 | group, community, religious, social, role, tim... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 9 | 9.0 | 0.9751 | œ, dorottya, martha, human, order, case, g, st... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 10 | 10.0 | 0.6800 | say, way, make, problem, time, life, mean, peo... | [../pdfs/Strine 2018.pdf, 8] | 
| 11 | 11.0 | 0.5710 | religion, religious, culture, cultural, christ... | [../pdfs/Nnamani 2015.pdf, 14] | 
| 12 | 12.0 | 0.6930 | work, place, family, home, like, case, make, m... | [../pdfs/Cruz - 2010 - Chapter Two. Frontiers ... | 
| 13 | 13.0 | 0.7600 | new, press, ed, york, power, study, global, pe... | [../pdfs/Snyder 2018.pdf, 17] | 
| 14 | 14.0 | 0.5200 | right, human, word, reality, state, world, tim... | [../pdfs/Davidson 2018.pdf, 7] | 
| 15 | 15.0 | 0.5200 | service, relationship, sense, work, good, espe... | [../pdfs/Rowlands 2018.pdf, 0] | 
| 16 | 16.0 | 0.5200 | theological, book, human, faith, case, g, life... | [../pdfs/Thompson 2017.pdf, 0] | 
| 17 | 17.0 | 0.5653 | research, form, study, mean, need, way, order,... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 18 | 18.0 | 0.6215 | woman, feminist, oppression, tulud, particular... | [../pdfs/Cruz - 2010 - Chapter Six. Expanding ... | 
| 19 | 19.0 | 0.6800 | mission, world, international, dorottya, marth... | [../pdfs/Frederiks and Nagy - 2016 - Religion,... | 
| 20 | 20.0 | 0.6381 | struggle, woman, life, oppression, feminist, e... | [../pdfs/cruz2010.pdf, 27] | 
| 21 | 21.0 | 0.7600 | god, people, power, faith, christian, life, wo... | [../pdfs/Thompson 2017.pdf, 3] | 
| 22 | 23.0 | 0.7664 | hong, kong, tulud, filipina, filipino, g, huma... | [../pdfs/Cruz - 2010 - An intercultural theolo... | 
| 23 | 24.0 | 0.5721 | filipino, philippine, hk, migrant, tulud, just... | [../pdfs/Cruz - 2010 - Chapter One. Geographie... | 
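The same "best page per topic" table can be produced without an explicit loop. This is an alternative sketch, not the approach used in the notebook: it uses pandas' groupby with idxmax on a toy dataframe (toy_df and best_rows are illustrative names) whose rows are taken from the tables above.

```python
import pandas as pd

# Toy stand-in for sent_topics_df, using rows that appear in the tables above.
toy_df = pd.DataFrame({
    'Dominant_topic': [9.0, 21.0, 21.0, 9.0],
    'Perc_Contrib': [0.7600, 0.5200, 0.7600, 0.6080],
    'Text': [['../pdfs/Davidson 2018.pdf', 0],
             ['../pdfs/Davidson 2018.pdf', 3],
             ['../pdfs/Thompson 2017.pdf', 3],
             ['../pdfs/Davidson 2018.pdf', 1]],
})

# idxmax gives, for each topic, the index label of its highest-contribution row.
best_index = toy_df.groupby('Dominant_topic')['Perc_Contrib'].idxmax()
best_rows = toy_df.loc[best_index].reset_index(drop=True)
print(best_rows)
```

With either approach, a topic that is never dominant on any page has no group and so no row, which is why the table above lists 24 topics and skips topic 22. Where two pages tie for the top score, idxmax keeps the first one, so the chosen page may differ from the sort-based version.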
One of the problems with topic modeling is that, because it is an unsupervised clustering method, the computer sometimes sees connections that are not obvious or, at the very least, do not form semantic clusters. Topic modeling is a blunt tool, but we picked six of these topics that we thought might help us discover books from the past 100 years that build on the topic we had chosen.
These topics are:
These topics were analyzed in the context of the PDFs that generated them. These were the topics that we thought were both coherent and likely to yield interesting analysis when compared against the political theology corpus generated from HathiTrust.
These six are the only topics we looked for in the HathiTrust corpus we had identified.