Assignment 9 - Working With Textual Data (Solutions)

This exercise is designed to get you working with the quanteda package and some other associated packages. The focus will be on exploring the package, getting some texts into the corpus object format, learning how to convert texts into document-feature matrices, and performing descriptive analyses on this data.

Data

Presidential Inaugural Corpus – inaugural.csv

This data includes the texts of 59 US presidential inaugural addresses from 1789 to the present. It also includes the following variables:

Variable	Description
`Year`	Year of inaugural address
`President`	President’s last name
`FirstName`	President’s first name (and possibly middle initial)
`Party`	Name of the President’s political party
`text`	Text of the inaugural address

Once you have downloaded this file and stored it somewhere sensible, you can load it into R using the following command:

inaugural <- read.csv("inaugural.csv")
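
After loading, a quick check can confirm that the expected columns are present without printing the full speeches to the console. The sketch below is only an illustration: it writes a two-row stand-in file to a temporary location (the `toy` data frame and its placeholder texts are invented for the example and are not the contents of inaugural.csv), reads it back with read.csv(), and previews only the first few characters of each text.

```r
# Hypothetical two-row stand-in for inaugural.csv (placeholder texts only)
toy <- data.frame(
  Year      = c(1789, 1793),
  President = c("Washington", "Washington"),
  FirstName = c("George", "George"),
  Party     = c("none", "none"),
  text      = c("First placeholder speech text.", "Second placeholder speech."),
  stringsAsFactors = FALSE
)

# Write the stand-in to a temporary CSV and read it back, as with the real file
toy_path <- tempfile(fileext = ".csv")
write.csv(toy, toy_path, row.names = FALSE)
loaded <- read.csv(toy_path, stringsAsFactors = FALSE)

# Check the dimensions and the column names
dim(loaded)
names(loaded)

# Preview the first 20 characters of each text instead of the whole speech
preview <- substr(loaded$text, 1, 20)
preview

# Number of characters in each text
text_length <- nchar(loaded$text)
text_length
```

The same checks should work on the real `inaugural` object; truncating with substr() is simply a way to keep the console readable when one column holds very long strings.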

1. Getting Started.

You will first need to install and load the following packages:

install.packages("quanteda")
install.packages("readtext")
install.packages("quanteda.textplots")
install.packages("quanteda.textstats")

library(quanteda)
library(quanteda.textplots)
library(quanteda.textstats)
library(readtext)
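
If some of these packages may already be installed, one option is to install only the ones that are missing. The following is a minimal sketch of that idea using base R; the helper names `missing_packages()` and `install_if_missing()` are made up for this illustration and are not part of quanteda or any other package.

```r
# Return the subset of `pkgs` whose namespace cannot be loaded.
# Note: requireNamespace() loads the namespace of each package it finds.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1),
                      quietly = TRUE, USE.NAMES = FALSE)
  pkgs[!installed]
}

# Install only the packages that are missing; returns their names invisibly
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Demonstration with packages that ship with R, so nothing is installed here
missing_packages(c("stats", "utils"))
```

For this assignment the call would be install_if_missing(c("quanteda", "readtext", "quanteda.textplots", "quanteda.textstats")), followed by the library() calls above.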

You will also need to install the package quanteda.corpora from GitHub using the install_github() function from the devtools package (install devtools first with install.packages("devtools") if you do not already have it):

devtools::install_github("quanteda/quanteda.corpora")
library(quanteda.corpora)

Exploring quanteda functions. Look at the Quick Start vignette, and browse the manual for quanteda. You can use the example() function on any function in the package to run its examples and see how it works. You should also browse the documentation, especially ?corpus, to see how a corpus is structured and constructed. The website http://quanteda.io has extensive documentation.

?corpus
example(dfm)
example(corpus)

2. Making a corpus and corpus structure

A corpus object is the foundation for all the analysis we will be doing in quanteda. The first thing to do when you load some text data into R is to convert it using the corpus() function.
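
As a small illustration of that conversion step, the sketch below builds a corpus from a data frame. The `toy` data frame and its placeholder texts are invented for the example (they are not the inaugural addresses); `text_field` tells corpus() which column holds the texts, and the remaining columns are kept as document variables.

```r
library(quanteda)

# Hypothetical two-document data frame (placeholder texts, not real speeches)
toy <- data.frame(
  Year      = c(1789, 1793),
  President = c("Washington", "Washington"),
  text      = c("This is a first placeholder document.",
                "This is a second placeholder document."),
  stringsAsFactors = FALSE
)

# Build the corpus; text_field names the column that holds the texts
toy_corpus <- corpus(toy, text_field = "text")

# Number of documents in the corpus
ndoc(toy_corpus)

# The other columns are stored as document-level variables
docvars(toy_corpus)
```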

The simplest way to create a corpus is to use a set of texts already present in R’s global environment. In our case, we previously loaded the inaugural.csv file and stored it as the inaugural object. Let’s have a look at this object to see what it contains. Use the head() function applied to the inaugural object and report the output. Which variable includes the texts of the inaugural addresses?

head(inaugural)

##   Year  President FirstName                 Party
## 1 1789 Washington    George                  none
## 2 1793 Washington    George                  none
## 3 1797      Adams      John            Federalist
## 4 1801  Jefferson    Thomas Democratic-Republican
## 5 1805  Jefferson    Thomas Democratic-Republican
## 6 1809    Madison     James Democratic-Republican
##   text
## 1 Fellow-Citizens of the Senate and of the House of Representatives:\n\nAmong the vicissitudes incident to life no event could have filled me with greater anxieties than that of which the notification was transmitted by your order, and received on the 14th day of the present month. 
On the one hand, I was summoned by my Country, whose voice I can never hear but with veneration and love, from a retreat which I had chosen with the fondest predilection, and, in my flattering hopes, with an immutable decision, as the asylum of my declining years  -  a retreat which was rendered every day more necessary as well as more dear to me by the addition of habit to inclination, and of frequent interruptions in my health to the gradual waste committed on it by time. On the other hand, the magnitude and difficulty of the trust to which the voice of my country called me, being sufficient to awaken in the wisest and most experienced of her citizens a distrustful scrutiny into his qualifications, could not but overwhelm with despondence one who (inheriting inferior endowments from nature and unpracticed in the duties of civil administration) ought to be peculiarly conscious of his own deficiencies. In this conflict of emotions all I dare aver is that it has been my faithful study to collect my duty from a just appreciation of every circumstance by which it might be affected. 
All I dare hope is that if, in executing this task, I have been too much swayed by a grateful remembrance of former instances, or by an affectionate sensibility to this transcendent proof of the confidence of my fellow citizens, and have thence too little consulted my incapacity as well as disinclination for the weighty and untried cares before me, my error will be palliated by the motives which mislead me, and its consequences be judged by my country with some share of the partiality in which they originated.\n\nSuch being the impressions under which I have, in obedience to the public summons, repaired to the present station, it would be peculiarly improper to omit in this first official act my fervent supplications to that Almighty Being who rules over the universe, who presides in the councils of nations, and whose providential aids can supply every human defect, that His benediction may consecrate to the liberties and happiness of the people of the United States a Government instituted by themselves for these essential purposes, and may enable every instrument employed in its administration to execute with success the functions allotted to his charge. In tendering this homage to the Great Author of every public and private good, I assure myself that it expresses your sentiments not less than my own, nor those of my fellow citizens at large less than either. No people can be bound to acknowledge and adore the Invisible Hand which conducts the affairs of men more than those of the United States. 
Every step by which they have advanced to the character of an independent nation seems to have been distinguished by some token of providential agency; and in the important revolution just accomplished in the system of their united government the tranquil deliberations and voluntary consent of so many distinct communities from which the event has resulted can not be compared with the means by which most governments have been established without some return of pious gratitude, along with an humble anticipation of the future blessings which the past seem to presage. These reflections, arising out of the present crisis, have forced themselves too strongly on my mind to be suppressed. You will join with me, I trust, in thinking that there are none under the influence of which the proceedings of a new and free government can more auspiciously commence.\n\nBy the article establishing the executive department it is made the duty of the President "to recommend to your consideration such measures as he shall judge necessary and expedient." The circumstances under which I now meet you will acquit me from entering into that subject further than to refer to the great constitutional charter under which you are assembled, and which, in defining your powers, designates the objects to which your attention is to be given. It will be more consistent with those circumstances, and far more congenial with the feelings which actuate me, to substitute, in place of a recommendation of particular measures, the tribute that is due to the talents, the rectitude, and the patriotism which adorn the characters selected to devise and adopt them. 
In these honorable qualifications I behold the surest pledges that as on one side no local prejudices or attachments, no separate views nor party animosities, will misdirect the comprehensive and equal eye which ought to watch over this great assemblage of communities and interests, so, on another, that the foundation of our national policy will be laid in the pure and immutable principles of private morality, and the preeminence of free government be exemplified by all the attributes which can win the affections of its citizens and command the respect of the world. I dwell on this prospect with every satisfaction which an ardent love for my country can inspire, since there is no truth more thoroughly established than that there exists in the economy and course of nature an indissoluble union between virtue and happiness; between duty and advantage; between the genuine maxims of an honest and magnanimous policy and the solid rewards of public prosperity and felicity; since we ought to be no less persuaded that the propitious smiles of Heaven can never be expected on a nation that disregards the eternal rules of order and right which Heaven itself has ordained; and since the preservation of the sacred fire of liberty and the destiny of the republican model of government are justly considered, perhaps, as deeply, as finally, staked on the experiment entrusted to the hands of the American people.\n\nBesides the ordinary objects submitted to your care, it will remain with your judgment to decide how far an exercise of the occasional power delegated by the fifth article of the Constitution is rendered expedient at the present juncture by the nature of objections which have been urged against the system, or by the degree of inquietude which has given birth to them. 
Instead of undertaking particular recommendations on this subject, in which I could be guided by no lights derived from official opportunities, I shall again give way to my entire confidence in your discernment and pursuit of the public good; for I assure myself that whilst you carefully avoid every alteration which might endanger the benefits of an united and effective government, or which ought to await the future lessons of experience, a reverence for the characteristic rights of freemen and a regard for the public harmony will sufficiently influence your deliberations on the question how far the former can be impregnably fortified or the latter be safely and advantageously promoted.\n\nTo the foregoing observations I have one to add, which will be most properly addressed to the House of Representatives. It concerns myself, and will therefore be as brief as possible. When I was first honored with a call into the service of my country, then on the eve of an arduous struggle for its liberties, the light in which I contemplated my duty required that I should renounce every pecuniary compensation. 
From this resolution I have in no instance departed; and being still under the impressions which produced it, I must decline as inapplicable to myself any share in the personal emoluments which may be indispensably included in a permanent provision for the executive department, and must accordingly pray that the pecuniary estimates for the station in which I am placed may during my continuance in it be limited to such actual expenditures as the public good may be thought to require.\n\nHaving thus imparted to you my sentiments as they have been awakened by the occasion which brings us together, I shall take my present leave; but not without resorting once more to the benign Parent of the Human Race in humble supplication that, since He has been pleased to favor the American people with opportunities for deliberating in perfect tranquillity, and dispositions for deciding with unparalleled unanimity on a form of government for the security of their union and the advancement of their happiness, so His divine blessing may be equally conspicuous in the enlarged views, the temperate consultations, and the wise measures on which the success of this Government must depend. 
## 2
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Fellow citizens, I am again called upon by the voice of my country to execute the functions of its Chief Magistrate. When the occasion proper for it shall arrive, I shall endeavor to express the high sense I entertain of this distinguished honor, and of the confidence which has been reposed in me by the people of united America.\n\nPrevious to the execution of any official act of the President the Constitution requires an oath of office. This oath I am now about to take, and in your presence: That if it shall be found during my administration of the Government I have in any instance violated willingly or knowingly the injunctions thereof, I may (besides incurring constitutional punishment) be subject to the upbraidings of all who are now witnesses of the present solemn ceremony.\n\n 
## 3 When it was first perceived, in early times, that no middle course for America remained between unlimited submission to a foreign legislature and a total independence of its claims, men of reflection were less apprehensive of danger from the formidable power of fleets and armies they must determine to resist than from those contests and dissensions which would certainly arise concerning the forms of government to be instituted over the whole and over the parts of this extensive country. Relying, however, on the purity of their intentions, the justice of their cause, and the integrity and intelligence of the people, under an overruling Providence which had so signally protected this country from the first, the representatives of this nation, then consisting of little more than half its present number, not only broke to pieces the chains which were forging and the rod of iron that was lifted up, but frankly cut asunder the ties which had bound them, and launched into an ocean of uncertainty.\n\nThe zeal and ardor of the people during the Revolutionary war, supplying the place of government, commanded a degree of order sufficient at least for the temporary preservation of society. The Confederation which was early felt to be necessary was prepared from the models of the Batavian and Helvetic confederacies, the only examples which remain with any detail and precision in history, and certainly the only ones which the people at large had ever considered. 
But reflecting on the striking difference in so many particulars between this country and those where a courier may go from the seat of government to the frontier in a single day, it was then certainly foreseen by some who assisted in Congress at the formation of it that it could not be durable.\n\nNegligence of its regulations, inattention to its recommendations, if not disobedience to its authority, not only in individuals but in States, soon appeared with their melancholy consequences  -  universal languor, jealousies and rivalries of States, decline of navigation and commerce, discouragement of necessary manufactures, universal fall in the value of lands and their produce, contempt of public and private faith, loss of consideration and credit with foreign nations, and at length in discontents, animosities, combinations, partial conventions, and insurrection, threatening some great national calamity.\n\nIn this dangerous crisis the people of America were not abandoned by their usual good sense, presence of mind, resolution, or integrity. Measures were pursued to concert a plan to form a more perfect union, establish justice, insure domestic tranquillity, provide for the common defense, promote the general welfare, and secure the blessings of liberty. The public disquisitions, discussions, and deliberations issued in the present happy Constitution of Government.\n\nEmployed in the service of my country abroad during the whole course of these transactions, I first saw the Constitution of the United States in a foreign country. Irritated by no literary altercation, animated by no public debate, heated by no party animosity, I read it with great satisfaction, as the result of good heads prompted by good hearts, as an experiment better adapted to the genius, character, situation, and relations of this nation and country than any which had ever been proposed or suggested. 
In its general principles and great outlines it was conformable to such a system of government as I had ever most esteemed, and in some States, my own native State in particular, had contributed to establish. Claiming a right of suffrage, in common with my fellow-citizens, in the adoption or rejection of a constitution which was to rule me and my posterity, as well as them and theirs, I did not hesitate to express my approbation of it on all occasions, in public and in private. It was not then, nor has been since, any objection to it in my mind that the Executive and Senate were not more permanent. Nor have I ever entertained a thought of promoting any alteration in it but such as the people themselves, in the course of their experience, should see and feel to be necessary or expedient, and by their representatives in Congress and the State legislatures, according to the Constitution itself, adopt and ordain.\n\nReturning to the bosom of my country after a painful separation from it for ten years, I had the honor to be elected to a station under the new order of things, and I have repeatedly laid myself under the most serious obligations to support the Constitution. 
The operation of it has equaled the most sanguine expectations of its friends, and from an habitual attention to it, satisfaction in its administration, and delight in its effects upon the peace, order, prosperity, and happiness of the nation I have acquired an habitual attachment to it and veneration for it.\n\nWhat other form of government, indeed, can so well deserve our esteem and love?\n\nThere may be little solidity in an ancient idea that congregations of men into cities and nations are the most pleasing objects in the sight of superior intelligences, but this is very certain, that to a benevolent human mind there can be no spectacle presented by any nation more pleasing, more noble, majestic, or august, than an assembly like that which has so often been seen in this and the other Chamber of Congress, of a Government in which the Executive authority, as well as that of all the branches of the Legislature, are exercised by citizens selected at regular periods by their neighbors to make and execute laws for the general good. Can anything essential, anything more than mere ornament and decoration, be added to this by robes and diamonds? Can authority be more amiable and respectable when it descends from accidents or institutions established in remote antiquity than when it springs fresh from the hearts and judgments of an honest and enlightened people? For it is the people only that are represented. It is their power and majesty that is reflected, and only for their good, in every legitimate government, under whatever form it may appear. The existence of such a government as ours for any length of time is a full proof of a general dissemination of knowledge and virtue throughout the whole body of the people. And what object or consideration more pleasing than this can be presented to the human mind? 
If national pride is ever justifiable or excusable it is when it springs, not from power or riches, grandeur or glory, but from conviction of national innocence, information, and benevolence.\n\nIn the midst of these pleasing ideas we should be unfaithful to ourselves if we should ever lose sight of the danger to our liberties if anything partial or extraneous should infect the purity of our free, fair, virtuous, and independent elections. If an election is to be determined by a majority of a single vote, and that can be procured by a party through artifice or corruption, the Government may be the choice of a party for its own ends, not of the nation for the national good. If that solitary suffrage can be obtained by foreign nations by flattery or menaces, by fraud or violence, by terror, intrigue, or venality, the Government may not be the choice of the American people, but of foreign nations. It may be foreign nations who govern us, and not we, the people, who govern ourselves; and candid men will acknowledge that in such cases choice would have little advantage to boast of over lot or chance.\n\nSuch is the amiable and interesting system of government (and such are some of the abuses to which it may be exposed) which the people of America have exhibited to the admiration and anxiety of the wise and virtuous of all nations for eight years under the administration of a citizen who, by a long course of great actions, regulated by prudence, justice, temperance, and fortitude, conducting a people inspired with the same virtues and animated with the same ardent patriotism and love of liberty to independence and peace, to increasing wealth and unexampled prosperity, has merited the gratitude of his fellow-citizens, commanded the highest praises of foreign nations, and secured immortal glory with posterity.\n\nIn that retirement which is his voluntary choice may he long live to enjoy the delicious recollection of his services, the gratitude of mankind, the happy fruits 
of them to himself and the world, which are daily increasing, and that splendid prospect of the future fortunes of this country which is opening from year to year. His name may be still a rampart, and the knowledge that he lives a bulwark, against all open or secret enemies of his country's peace. This example has been recommended to the imitation of his successors by both Houses of Congress and by the voice of the legislatures and the people throughout the nation.\n\nOn this subject it might become me better to be silent or to speak with diffidence; but as something may be expected, the occasion, I hope, will be admitted as an apology if I venture to say that if a preference, upon principle, of a free republican government, formed upon long and serious reflection, after a diligent and impartial inquiry after truth; if an attachment to the Constitution of the United States, and a conscientious determination to support it until it shall be altered by the judgments and wishes of the people, expressed in the mode prescribed in it; if a respectful attention to the constitutions of the individual States and a constant caution and delicacy toward the State governments; if an equal and impartial regard to the rights, interest, honor, and happiness of all the States in the Union, without preference or regard to a northern or southern, an eastern or western, position, their various political opinions on unessential points or their personal attachments; if a love of virtuous men of all parties and denominations; if a love of science and letters and a wish to patronize every rational effort to encourage schools, colleges, universities, academies, and every institution for propagating knowledge, virtue, and religion among all classes of the people, not only for their benign influence on the happiness of life in all its stages and classes, and of society in all its forms, but as the only means of preserving our Constitution from its natural enemies, the spirit of sophistry, the 
spirit of party, the spirit of intrigue, the profligacy of corruption, and the pestilence of foreign influence, which is the angel of destruction to elective governments; if a love of equal laws, of justice, and humanity in the interior administration; if an inclination to improve agriculture, commerce, and manufacturers for necessity, convenience, and defense; if a spirit of equity and humanity toward the aboriginal nations of America, and a disposition to meliorate their condition by inclining them to be more friendly to us, and our citizens to be more friendly to them; if an inflexible determination to maintain peace and inviolable faith with all nations, and that system of neutrality and impartiality among the belligerent powers of Europe which has been adopted by this Government and so solemnly sanctioned by both Houses of Congress and applauded by the legislatures of the States and the public opinion, until it shall be otherwise ordained by Congress; if a personal esteem for the French nation, formed in a residence of seven years chiefly among them, and a sincere desire to preserve the friendship which has been so much for the honor and interest of both nations; if, while the conscious honor and integrity of the people of America and the internal sentiment of their own power and energies must be preserved, an earnest endeavor to investigate every just cause and remove every colorable pretense of complaint; if an intention to pursue by amicable negotiation a reparation for the injuries that have been committed on the commerce of our fellow-citizens by whatever nation, and if success can not be obtained, to lay the facts before the Legislature, that they may consider what further measures the honor and interest of the Government and its constituents demand; if a resolution to do justice as far as may depend upon me, at all times and to all nations, and maintain peace, friendship, and benevolence with all the world; if an unshaken confidence in the honor, 
spirit, and resources of the American people, on which I have so often hazarded my all and never been deceived; if elevated ideas of the high destinies of this country and of my own duties toward it, founded on a knowledge of the moral principles and intellectual improvements of the people deeply engraven on my mind in early life, and not obscured but exalted by experience and age; and, with humble reverence, I feel it to be my duty to add, if a veneration for the religion of a people who profess and call themselves Christians, and a fixed resolution to consider a decent respect for Christianity among the best recommendations for the public service, can enable me in any degree to comply with your wishes, it shall be my strenuous endeavor that this sagacious injunction of the two Houses shall not be without effect.\n\nWith this great example before me, with the sense and spirit, the faith and honor, the duty and interest, of the same American people pledged to support the Constitution of the United States, I entertain no doubt of its continuance in all its energy, and my mind is prepared without hesitation to lay myself under the most solemn obligations to support it to the utmost of my power.\n\nAnd may that Being who is supreme over all, the Patron of Order, the Fountain of Justice, and the Protector in all ages of the world of virtuous liberty, continue His blessing upon this nation and its Government and give it all possible success and duration consistent with the ends of His providence.
## 4 Friends and Fellow Citizens:\n\nCalled upon to undertake the duties of the first executive office of our country, I avail myself of the presence of that portion of my fellow citizens which is here assembled to express my grateful thanks for the
favor with which they have been pleased to look toward me, to declare a sincere consciousness that the task is above my talents, and that I approach it with those anxious and awful presentiments which the greatness of the charge and the weakness of my powers so justly inspire. A rising nation, spread over a wide and fruitful land, traversing all the seas with the rich productions of their industry, engaged in commerce with nations who feel power and forget right, advancing rapidly to destinies beyond the reach of mortal eye  -  when I contemplate these transcendent objects, and see the honor, the happiness, and the hopes of this beloved country committed to the issue, and the auspices of this day, I shrink from the contemplation, and humble myself before the magnitude of the undertaking. Utterly, indeed, should I despair did not the presence of many whom I here see remind me that in the other high authorities provided by our Constitution I shall find resources of wisdom, of virtue, and of zeal on which to rely under all difficulties. To you, then, gentlemen, who are charged with the sovereign functions of legislation, and to those associated with you, I look with encouragement for that guidance and support which may enable us to steer with safety the vessel in which we are all embarked amidst the conflicting elements of a troubled world.\n\nDuring the contest of opinion through which we have passed the animation of discussions and of exertions has sometimes worn an aspect which might impose on strangers unused to think freely and to speak and to write what they think; but this being now decided by the voice of the nation, announced according to the rules of the Constitution, all will, of course, arrange themselves under the will of the law, and unite in common efforts for the common good. 
All, too, will bear in mind this sacred principle, that though the will of the majority is in all cases to prevail, that will to be rightful must be reasonable; that the minority possess their equal rights, which equal law must protect, and to violate would be oppression. Let us, then, fellow citizens, unite with one heart and one mind. Let us restore to social intercourse that harmony and affection without which liberty and even life itself are but dreary things. And let us reflect that, having banished from our land that religious intolerance under which mankind so long bled and suffered, we have yet gained little if we countenance a political intolerance as despotic, as wicked, and capable of as bitter and bloody persecutions. During the throes and convulsions of the ancient world, during the agonizing spasms of infuriated man, seeking through blood and slaughter his long-lost liberty, it was not wonderful that the agitation of the billows should reach even this distant and peaceful shore; that this should be more felt and feared by some and less by others, and should divide opinions as to measures of safety. But every difference of opinion is not a difference of principle. We have called by different names brethren of the same principle. We are all Republicans, we are all Federalists. If there be any among us who would wish to dissolve this Union or to change its republican form, let them stand undisturbed as monuments of the safety with which error of opinion may be tolerated where reason is left free to combat it. I know, indeed, that some honest men fear that a republican government can not be strong, that this Government is not strong enough; but would the honest patriot, in the full tide of successful experiment, abandon a government which has so far kept us free and firm on the theoretic and visionary fear that this Government, the world's best hope, may by possibility want energy to preserve itself? I trust not. 
I believe this, on the contrary, the strongest Government on earth. I believe it the only one where every man, at the call of the law, would fly to the standard of the law, and would meet invasions of the public order as his own personal concern. Sometimes it is said that man can not be trusted with the government of himself. Can he, then, be trusted with the government of others? Or have we found angels in the forms of kings to govern him? Let history answer this question.\n\nLet us, then, with courage and confidence pursue our own Federal and Republican principles, our attachment to union and representative government. Kindly separated by nature and a wide ocean from the exterminating havoc of one quarter of the globe; too high-minded to endure the degradations of the others; possessing a chosen country, with room enough for our descendants to the thousandth and thousandth generation; entertaining a due sense of our equal right to the use of our own faculties, to the acquisitions of our own industry, to honor and confidence from our fellow citizens, resulting not from birth, but from our actions and their sense of them; enlightened by a benign religion, professed, indeed, and practiced in various forms, yet all of them inculcating honesty, truth, temperance, gratitude, and the love of man; acknowledging and adoring an overruling Providence, which by all its dispensations proves that it delights in the happiness of man here and his greater happiness hereafter  -  with all these blessings, what more is necessary to make us a happy and a prosperous people? Still one thing more, fellow citizens  -  a wise and frugal Government, which shall restrain men from injuring one another, shall leave them otherwise free to regulate their own pursuits of industry and improvement, and shall not take from the mouth of labor the bread it has earned. 
This is the sum of good government, and this is necessary to close the circle of our felicities.\n\nAbout to enter, fellow-citizens, on the exercise of duties which comprehend everything dear and valuable to you, it is proper you should understand what I deem the essential principles of our Government, and consequently those which ought to shape its Administration. I will compress them within the narrowest compass they will bear, stating the general principle, but not all its limitations. Equal and exact justice to all men, of whatever state or persuasion, religious or political; peace, commerce, and honest friendship with all nations, entangling alliances with none; the support of the State governments in all their rights, as the most competent administrations for our domestic concerns and the surest bulwarks against antirepublican tendencies; the preservation of the General Government in its whole constitutional vigor, as the sheet anchor of our peace at home and safety abroad; a jealous care of the right of election by the people  -  a mild and safe corrective of abuses which are lopped by the sword of revolution where peaceable remedies are unprovided; absolute acquiescence in the decisions of the majority, the vital principle of republics, from which is no appeal but to force, the vital principle and immediate parent of despotism; a well disciplined militia, our best reliance in peace and for the first moments of war, till regulars may relieve them; the supremacy of the civil over the military authority; economy in the public expense, that labor may be lightly burthened; the honest payment of our debts and sacred preservation of the public faith; encouragement of agriculture, and of commerce as its handmaid; the diffusion of information and arraignment of all abuses at the bar of the public reason; freedom of religion; freedom of the press, and freedom of person under the protection of the habeas corpus, and trial by juries impartially selected. 
These principles form the bright constellation which has gone before us and guided our steps through an age of revolution and reformation. The wisdom of our sages and blood of our heroes have been devoted to their attainment. They should be the creed of our political faith, the text of civic instruction, the touchstone by which to try the services of those we trust; and should we wander from them in moments of error or of alarm, let us hasten to retrace our steps and to regain the road which alone leads to peace, liberty, and safety.\n\nI repair, then, fellow-citizens, to the post you have assigned me. With experience enough in subordinate offices to have seen the difficulties of this the greatest of all, I have learnt to expect that it will rarely fall to the lot of imperfect man to retire from this station with the reputation and the favor which bring him into it. Without pretensions to that high confidence you reposed in our first and greatest revolutionary character, whose preeminent services had entitled him to the first place in his country's love and destined for him the fairest page in the volume of faithful history, I ask so much confidence only as may give firmness and effect to the legal administration of your affairs. I shall often go wrong through defect of judgment. When right, I shall often be thought wrong by those whose positions will not command a view of the whole ground. I ask your indulgence for my own errors, which will never be intentional, and your support against the errors of others, who may condemn what they would not if seen in all its parts. 
The approbation implied by your suffrage is a great consolation to me for the past, and my future solicitude will be to retain the good opinion of those who have bestowed it in advance, to conciliate that of others by doing them all the good in my power, and to be instrumental to the happiness and freedom of all.\n\nRelying, then, on the patronage of your good will, I advance with obedience to the work, ready to retire from it whenever you become sensible how much better choice it is in your power to make. And may that Infinite Power which rules the destinies of the universe lead our councils to what is best, and give them a favorable issue for your peace and prosperity.\n\n 
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        Proceeding, fellow citizens, to that qualification which the Constitution requires before my entrance on the charge again conferred on me, it is my duty to express the deep sense I entertain of this new proof of confidence from my fellow citizens at large, and the zeal with which it inspires me so to conduct myself as may best satisfy their just expectations.\n\nOn taking this station on a former occasion I declared the principles on which I believed it my duty to administer the affairs of our Commonwealth. MY conscience tells me I have on every occasion acted up to that declaration according to its obvious import and to the understanding of every candid mind.\n\nIn the transaction of your foreign affairs we have endeavored to cultivate the friendship of all nations, and especially of those with which we have the most important relations. We have done them justice on all occasions, favored where favor was lawful, and cherished mutual interests and intercourse on fair and equal terms. 
We are firmly convinced, and we act on that conviction, that with nations as with individuals our interests soundly calculated will ever be found inseparable from our moral duties, and history bears witness to the fact that a just nation is trusted on its word when recourse is had to armaments and wars to bridle others.\n\nAt home, fellow citizens, you best know whether we have done well or ill. The suppression of unnecessary offices, of useless establishments and expenses, enabled us to discontinue our internal taxes. These, covering our land with officers and opening our doors to their intrusions, had already begun that process of domiciliary vexation which once entered is scarcely to be restrained from reaching successively every article of property and produce. If among these taxes some minor ones fell which had not been inconvenient, it was because their amount would not have paid the officers who collected them, and because, if they had any merit, the State authorities might adopt them instead of others less approved.\n\nThe remaining revenue on the consumption of foreign articles is paid chiefly by those who can afford to add foreign luxuries to domestic comforts, being collected on our seaboard and frontiers only, and incorporated with the transactions of our mercantile citizens, it may be the pleasure and the pride of an American to ask, What farmer, what mechanic, what laborer ever sees a taxgatherer of the United States? 
These contributions enable us to support the current expenses of the Government, to fulfill contracts with foreign nations, to extinguish the native right of soil within our limits, to extend those limits, and to apply such a surplus to our public debts as places at a short day their final redemption, and that redemption once effected the revenue thereby liberated may, by a just repartition of it among the States and a corresponding amendment of the Constitution, be applied in time of peace to rivers, canals, roads, arts, manufactures, education, and other great objects within each State. In time of war, if injustice by ourselves or others must sometimes produce war, increased as the same revenue will be by increased population and consumption, and aided by other resources reserved for that crisis, it may meet within the year all the expenses of the year without encroaching on the rights of future generations by burthening them with the debts of the past. War will then be but a suspension of useful works, and a return to a state of peace, a return to the progress of improvement.\n\nI have said, fellow citizens, that the income reserved had enabled us to extend our limits, but that extension may possibly pay for itself before we are called on, and in the meantime may keep down the accruing interest; in all events, it will replace the advances we shall have made. I know that the acquisition of Louisiana had been disapproved by some from a candid apprehension that the enlargement of our territory would endanger its union. But who can limit the extent to which the federative principle may operate effectively? The larger our association the less will it be shaken by local passions; and in any view is it not better that the opposite bank of the Mississippi should be settled by our own brethren and children than by strangers of another family? 
With which should we be most likely to live in harmony and friendly intercourse?\n\nIn matters of religion I have considered that its free exercise is placed by the Constitution independent of the powers of the General Government. I have therefore undertaken on no occasion to prescribe the religious exercises suited to it, but have left them, as the Constitution found them, under the direction and discipline of the church or state authorities acknowledged by the several religious societies.\n\nThe aboriginal inhabitants of these countries I have regarded with the commiseration their history inspires. Endowed with the faculties and the rights of men, breathing an ardent love of liberty and independence, and occupying a country which left them no desire but to be undisturbed, the stream of overflowing population from other regions directed itself on these shores; without power to divert or habits to contend against it, they have been overwhelmed by the current or driven before it; now reduced within limits too narrow for the hunter's state, humanity enjoins us to teach them agriculture and the domestic arts; to encourage them to that industry which alone can enable them to maintain their place in existence and to prepare them in time for that state of society which to bodily comforts adds the improvement of the mind and morals. 
We have therefore liberally furnished them with the implements of husbandry and household use; we have placed among them instructors in the arts of first necessity, and they are covered with the aegis of the law against aggressors from among ourselves.\n\nBut the endeavors to enlighten them on the fate which awaits their present course of life, to induce them to exercise their reason, follow its dictates, and change their pursuits with the change of circumstances have powerful obstacles to encounter; they are combated by the habits of their bodies, prejudices of their minds, ignorance, pride, and the influence of interested and crafty individuals among them who feel themselves something in the present order of things and fear to become nothing in any other. These persons inculcate a sanctimonious reverence for the customs of their ancestors; that whatsoever they did must be done through all time; that reason is a false guide, and to advance under its counsel in their physical, moral, or political condition is perilous innovation; that their duty is to remain as their Creator made them, ignorance being safety and knowledge full of danger; in short, my friends, among them also is seen the action and counteraction of good sense and of bigotry; they too have their antiphilosophists who find an interest in keeping things in their present state, who dread reformation, and exert all their faculties to maintain the ascendancy of habit over the duty of improving our reason and obeying its mandates.\n\nIn giving these outlines I do not mean, fellow citizens, to arrogate to myself the merit of the measures. That is due, in the first place, to the reflecting character of our citizens at large, who, by the weight of public opinion, influence and strengthen the public measures. It is due to the sound discretion with which they select from among themselves those to whom they confide the legislative duties. 
It is due to the zeal and wisdom of the characters thus selected, who lay the foundations of public happiness in wholesome laws, the execution of which alone remains for others, and it is due to the able and faithful auxiliaries, whose patriotism has associated them with me in the executive functions.\n\nDuring this course of administration, and in order to disturb it, the artillery of the press has been leveled against us, charged with whatsoever its licentiousness could devise or dare. These abuses of an institution so important to freedom and science are deeply to be regretted, inasmuch as they tend to lessen its usefulness and to sap its safety. They might, indeed, have been corrected by the wholesome punishments reserved to and provided by the laws of the several States against falsehood and defamation, but public duties more urgent press on the time of public servants, and the offenders have therefore been left to find their punishment in the public indignation.\n\nNor was it uninteresting to the world that an experiment should be fairly and fully made, whether freedom of discussion, unaided by power, is not sufficient for the propagation and protection of truth  -  whether a government conducting itself in the true spirit of its constitution, with zeal and purity, and doing no act which it would be unwilling the whole world should witness, can be written down by falsehood and defamation. 
The experiment has been tried; you have witnessed the scene; our fellow citizens looked on, cool and collected; they saw the latent source from which these outrages proceeded; they gathered around their public functionaries, and when the Constitution called them to the decision by suffrage, they pronounced their verdict, honorable to those who had served them and consolatory to the friend of man who believes that he may be trusted with the control of his own affairs.\n\nNo inference is here intended that the laws provided by the States against false and defamatory publications should not be enforced; he who has time renders a service to public morals and public tranquillity in reforming these abuses by the salutary coercions of the law; but the experiment is noted to prove that, since truth and reason have maintained their ground against false opinions in league with false facts, the press, confined to truth, needs no other legal restraint; the public judgment will correct false reasoning and opinions on a full hearing of all parties; and no other definite line can be drawn between the inestimable liberty of the press and its demoralizing licentiousness. If there be still improprieties which this rule would not restrain, its supplement must be sought in the censorship of public opinion.\n\nContemplating the union of sentiment now manifested so generally as auguring harmony and happiness to our future course, I offer to our country sincere congratulations. 
With those, too, not yet rallied to the same point the disposition to do so is gaining strength; facts are piercing through the veil drawn over them, and our doubting brethren will at length see that the mass of their fellow citizens with whom they can not yet resolve to act as to principles and measures, think as they think and desire what they desire; that our wish as well as theirs is that the public efforts may be directed honestly to the public good, that peace be cultivated, civil and religious liberty unassailed, law and order preserved, equality of rights maintained, and that state of property, equal or unequal, which results to every man from his own industry or that of his father's. When satisfied of these views it is not in human nature that they should not approve and support them. In the meantime let us cherish them with patient affection, let us do them justice, and more than justice, in all competitions of interest; and we need not doubt that truth, reason, and their own interests will at length prevail, will gather them into the fold of their country, and will complete that entire union of opinion which gives to a nation the blessing of harmony and the benefit of all its strength.\n\nI shall now enter on the duties to which my fellow citizens have again called me, and shall proceed in the spirit of those principles which they have approved. I fear not that any motives of interest may lead me astray; I am sensible of no passion which could seduce me knowingly from the path of justice, but the weaknesses of human nature and the limits of my own understanding will produce errors of judgment sometimes injurious to your interests. I shall need, therefore, all the indulgence which I have heretofore experienced from my constituents; the want of it will certainly not lessen with increasing years. 
I shall need, too, the favor of that Being in whose hands we are, who led our fathers, as Israel of old, from their native land and planted them in a country flowing with all the necessaries and comforts of life; who has covered our infancy with His providence and our riper years with His wisdom and power, and to whose goodness I ask you to join in supplications with me that He will so enlighten the minds of your servants, guide their councils, and prosper their measures that whatsoever they do shall result in your good, and shall secure to you the peace, friendship, and approbation of all nations.
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Unwilling to depart from examples of the most revered authority, I avail myself of the occasion now presented to express the profound impression made on me by the call of my country to the station to the duties of which I am about to pledge myself by the most solemn of sanctions. So distinguished a mark of confidence, proceeding from the deliberate and tranquil suffrage of a free and virtuous nation, would under any circumstances have commanded my gratitude and devotion, as well as filled me with an awful sense of the trust to be assumed. Under the various circumstances which give peculiar solemnity to the existing period, I feel that both the honor and the responsibility allotted to me are inexpressibly enhanced.\n\nThe present situation of the world is indeed without a parallel and that of our own country full of difficulties. The pressure of these, too, is the more severely felt because they have fallen upon us at a moment when the national prosperity being at a height not before attained, the contrast resulting from the change has been rendered the more striking. 
Under the benign influence of our republican institutions, and the maintenance of peace with all nations whilst so many of them were engaged in bloody and wasteful wars, the fruits of a just policy were enjoyed in an unrivaled growth of our faculties and resources. Proofs of this were seen in the improvements of agriculture, in the successful enterprises of commerce, in the progress of manufacturers and useful arts, in the increase of the public revenue and the use made of it in reducing the public debt, and in the valuable works and establishments everywhere multiplying over the face of our land.\n\nIt is a precious reflection that the transition from this prosperous condition of our country to the scene which has for some time been distressing us is not chargeable on any unwarrantable views, nor, as I trust, on any involuntary errors in the public councils. Indulging no passions which trespass on the rights or the repose of other nations, it has been the true glory of the United States to cultivate peace by observing justice, and to entitle themselves to the respect of the nations at war by fulfilling their neutral obligations with the most scrupulous impartiality. If there be candor in the world, the truth of these assertions will not be questioned; posterity at least will do justice to them.\n\nThis unexceptionable course could not avail against the injustice and violence of the belligerent powers. In their rage against each other, or impelled by more direct motives, principles of retaliation have been introduced equally contrary to universal reason and acknowledged law. How long their arbitrary edicts will be continued in spite of the demonstrations that not even a pretext for them has been given by the United States, and of the fair and liberal attempt to induce a revocation of them, can not be anticipated. 
Assuring myself that under every vicissitude the determined spirit and united councils of the nation will be safeguards to its honor and its essential interests, I repair to the post assigned me with no other discouragement than what springs from my own inadequacy to its high duties. If I do not sink under the weight of this deep conviction it is because I find some support in a consciousness of the purposes and a confidence in the principles which I bring with me into this arduous service.\n\nTo cherish peace and friendly intercourse with all nations having correspondent dispositions; to maintain sincere neutrality toward belligerent nations; to prefer in all cases amicable discussion and reasonable accommodation of differences to a decision of them by an appeal to arms; to exclude foreign intrigues and foreign partialities, so degrading to all countries and so baneful to free ones; to foster a spirit of independence too just to invade the rights of others, too proud to surrender our own, too liberal to indulge unworthy prejudices ourselves and too elevated not to look down upon them in others; to hold the union of the States as the basis of their peace and happiness; to support the Constitution, which is the cement of the Union, as well in its limitations as in its authorities; to respect the rights and authorities reserved to the States and to the people as equally incorporated with and essential to the success of the general system; to avoid the slightest interference with the right of conscience or the functions of religion, so wisely exempted from civil jurisdiction; to preserve in their full energy the other salutary provisions in behalf of private and personal rights, and of the freedom of the press; to observe economy in public expenditures; to liberate the public resources by an honorable discharge of the public debts; to keep within the requisite limits a standing military force, always remembering that an armed and trained militia is the firmest bulwark 
of republics  -  that without standing armies their liberty can never be in danger, nor with large ones safe; to promote by authorized means improvements friendly to agriculture, to manufactures, and to external as well as internal commerce; to favor in like manner the advancement of science and the diffusion of information as the best aliment to true liberty; to carry on the benevolent plans which have been so meritoriously applied to the conversion of our aboriginal neighbors from the degradation and wretchedness of savage life to a participation of the improvements of which the human mind and manners are susceptible in a civilized state  -  as far as sentiments and intentions such as these can aid the fulfillment of my duty, they will be a resource which can not fail me.\n\nIt is my good fortune, moreover, to have the path in which I am to tread lighted by examples of illustrious services successfully rendered in the most trying difficulties by those who have marched before me. Of those of my immediate predecessor it might least become me here to speak. I may, however, be pardoned for not suppressing the sympathy with which my heart is full in the rich reward he enjoys in the benedictions of a beloved country, gratefully bestowed or exalted talents zealously devoted through a long career to the advancement of its highest interest and happiness.\n\nBut the source to which I look or the aids which alone can supply my deficiencies is in the well-tried intelligence and virtue of my fellow-citizens, and in the counsels of those representing them in the other departments associated in the care of the national interests. 
In these my confidence will under every difficulty be best placed, next to that which we have all been encouraged to feel in the guardianship and guidance of that Almighty Being whose power regulates the destiny of nations, whose blessings have been so conspicuously dispensed to this rising Republic, and to whom we are bound to address our devout gratitude for the past, as well as our fervent supplications and best hopes for the future.

The output tells us that this is a data.frame and we can see the first six lines of the data. The column labelled text contains the texts of the inaugural addresses.

Use the corpus() function on this set of texts to create a new corpus. The first argument to corpus() should be the inaugural object. You will also need to set the text_field to be equal to "text" so that quanteda knows that the text we are interested in is saved in that variable.

inaugural_corpus <- corpus(inaugural, text_field = "text")

Once you have constructed this corpus, use the summary() method to see a brief description of the corpus. Which inaugural address was the longest in terms of the number of sentences?

summary(inaugural_corpus)

## Corpus consisting of 59 documents, showing 59 documents:
## 
##    Text Types Tokens Sentences Year  President       FirstName
##   text1   625   1537        23 1789 Washington          George
##   text2    96    147         4 1793 Washington          George
##   text3   826   2577        37 1797      Adams            John
##   text4   717   1923        41 1801  Jefferson          Thomas
##   text5   804   2380        45 1805  Jefferson          Thomas
##   text6   535   1261        21 1809    Madison           James
##   text7   541   1302        33 1813    Madison           James
##   text8  1040   3677       121 1817     Monroe           James
##   text9  1259   4886       131 1821     Monroe           James
##  text10  1003   3147        74 1825      Adams     John Quincy
##  text11   517   1208        25 1829    Jackson          Andrew
##  text12   499   1267        29 1833    Jackson          Andrew
##  text13  1315   4158        95 1837  Van Buren          Martin
##  text14  1898   9123       210 1841   Harrison   William Henry
##  text15  1334   5186       153 1845       Polk      James Knox
##  text16   496   1178        22 1849     Taylor         Zachary
##  text17  1165   3636       104 1853     Pierce        Franklin
##  text18   945   3083        89 1857   Buchanan           James
##  text19  1075   3999       135 1861    Lincoln         Abraham
##  text20   360    775        26 1865    Lincoln         Abraham
##  text21   485   1229        40 1869      Grant      Ulysses S.
##  text22   552   1472        43 1873      Grant      Ulysses S.
##  text23   831   2707        59 1877      Hayes   Rutherford B.
##  text24  1021   3209       111 1881   Garfield        James A.
##  text25   676   1816        44 1885  Cleveland          Grover
##  text26  1352   4721       157 1889   Harrison        Benjamin
##  text27   821   2125        58 1893  Cleveland          Grover
##  text28  1232   4353       130 1897   McKinley         William
##  text29   854   2437       100 1901   McKinley         William
##  text30   404   1079        33 1905  Roosevelt        Theodore
##  text31  1437   5821       158 1909       Taft  William Howard
##  text32   658   1882        68 1913     Wilson         Woodrow
##  text33   549   1652        59 1917     Wilson         Woodrow
##  text34  1169   3719       148 1921    Harding       Warren G.
##  text35  1220   4440       196 1925   Coolidge          Calvin
##  text36  1090   3860       158 1929     Hoover         Herbert
##  text37   743   2057        85 1933  Roosevelt     Franklin D.
##  text38   725   1989        96 1937  Roosevelt     Franklin D.
##  text39   526   1519        68 1941  Roosevelt     Franklin D.
##  text40   275    633        27 1945  Roosevelt     Franklin D.
##  text41   781   2504       116 1949     Truman        Harry S.
##  text42   900   2743       119 1953 Eisenhower       Dwight D.
##  text43   621   1907        92 1957 Eisenhower       Dwight D.
##  text44   566   1541        52 1961    Kennedy         John F.
##  text45   568   1710        93 1965    Johnson   Lyndon Baines
##  text46   743   2416       103 1969      Nixon Richard Milhous
##  text47   544   1995        68 1973      Nixon Richard Milhous
##  text48   527   1369        52 1977     Carter           Jimmy
##  text49   902   2780       129 1981     Reagan          Ronald
##  text50   925   2909       123 1985     Reagan          Ronald
##  text51   795   2673       141 1989       Bush          George
##  text52   642   1833        81 1993    Clinton            Bill
##  text53   773   2436       111 1997    Clinton            Bill
##  text54   621   1806        97 2001       Bush       George W.
##  text55   772   2312        99 2005       Bush       George W.
##  text56   938   2689       110 2009      Obama          Barack
##  text57   814   2317        88 2013      Obama          Barack
##  text58   582   1660        88 2017      Trump       Donald J.
##  text59   811   2766       216 2021      Biden       Joseph R.
##                  Party
##                   none
##                   none
##             Federalist
##  Democratic-Republican
##  Democratic-Republican
##  Democratic-Republican
##  Democratic-Republican
##  Democratic-Republican
##  Democratic-Republican
##  Democratic-Republican
##             Democratic
##             Democratic
##             Democratic
##                   Whig
##                   Whig
##                   Whig
##             Democratic
##             Democratic
##             Republican
##             Republican
##             Republican
##             Republican
##             Republican
##             Republican
##             Democratic
##             Republican
##             Democratic
##             Republican
##             Republican
##             Republican
##             Republican
##             Democratic
##             Democratic
##             Republican
##             Republican
##             Republican
##             Democratic
##             Democratic
##             Democratic
##             Democratic
##             Democratic
##             Republican
##             Republican
##             Democratic
##             Democratic
##             Republican
##             Republican
##             Democratic
##             Republican
##             Republican
##             Republican
##             Democratic
##             Democratic
##             Republican
##             Republican
##             Democratic
##             Democratic
##             Republican
##             Democratic

Joe Biden’s 2021 address had the largest number of sentences (216), just ahead of William Henry Harrison’s 1841 address (210).

Note that although we specified text_field = "text" when constructing the corpus, we have not removed the metadata associated with the texts. To access the other variables, we can use the docvars() function applied to the corpus object that we created above. Try this now.

head(docvars(inaugural_corpus))

##   Year  President FirstName                 Party
## 1 1789 Washington    George                  none
## 2 1793 Washington    George                  none
## 3 1797      Adams      John            Federalist
## 4 1801  Jefferson    Thomas Democratic-Republican
## 5 1805  Jefferson    Thomas Democratic-Republican
## 6 1809    Madison     James Democratic-Republican

3. Tokenizing texts

In order to count word frequencies, we first need to split the text into words (or longer phrases) through a process known as tokenization. Look at the documentation for quanteda’s tokens() function.

Use the tokens() function on the inaugural_corpus object, and examine the results.

inaugural_tokens <- tokens(inaugural_corpus)

Experiment with some of the arguments of the tokens() function, such as remove_punct and remove_numbers.

inaugural_tokens <- tokens(inaugural_corpus, remove_punct = TRUE, remove_numbers = TRUE)

Try tokenizing the texts from inaugural_corpus into sentences, using tokens(x, what = "sentence").

inaugural_sentences <- tokens(inaugural_corpus, what = "sentence")
inaugural_sentences[1:2]
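
To make the effect of these options concrete, here is a rough base R sketch of word and sentence splitting on a made-up sentence. It is only an approximation under simple assumptions (whitespace-separated words, sentences ending in ., ! or ?); quanteda's tokens() uses more careful boundary rules, and the object names below are illustrative only.

```r
# Toy text (hypothetical; not taken from the inaugural corpus)
txt <- "We the people, in 1789, began. We persist!"

# Roughly what remove_punct = TRUE and remove_numbers = TRUE achieve:
# replace punctuation and digits with spaces, then split on whitespace
cleaned <- gsub("[[:punct:][:digit:]]+", " ", txt)
word_tokens <- strsplit(trimws(cleaned), "[[:space:]]+")[[1]]

# Rough sentence segmentation: split on whitespace that follows ., ! or ?
sentence_tokens <- strsplit(txt, "(?<=[.!?])\\s+", perl = TRUE)[[1]]

print(word_tokens)
print(sentence_tokens)
```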

4. Explore some phrases in the text.

quanteda provides a keyword-in-context function that is easily usable and configurable to explore texts in a descriptive way. Use the kwic() function (for “keywords-in-context”) to explore how a specific word or phrase is used in this corpus (use the word-based tokenization that you implemented above). You can look at the help file (?kwic) to see the arguments that the function takes.

kwic(inaugural_tokens, "terror", 3)

## Keyword-in-context with 8 matches.                                                           
##   [text3, 1190]                   or violence by | terror |
##    [text37, 99] nameless unreasoning unjustified | terror |
##   [text39, 257]                  by a fatalistic | terror |
##   [text44, 761]             uncertain balance of | terror |
##   [text49, 700]               Americans from the | terror |
##   [text53, 921]                the fanaticism of | terror |
##  [text53, 1454]           strong defense against | terror |
##  [text56, 1442]                 aims by inducing | terror |
##                            
##  intrigue or venality      
##  which paralyzes needed    
##  we proved that            
##  that stays the            
##  of runaway living         
##  And they torment          
##  and destruction Our       
##  and slaughtering innocents

Try substituting your own search terms, or working with your own corpus.

head(kwic(inaugural_tokens, "america", 3))

## Keyword-in-context with 6 matches.                                                                           
##    [text2, 59]      people of united | America | Previous to the           
##    [text3, 14]     middle course for | America | remained between unlimited
##   [text3, 385]         the people of | America | were not abandoned        
##  [text3, 1272]         the people of | America | have exhibited to         
##  [text3, 1791] aboriginal nations of | America | and a disposition         
##  [text3, 1929]         the people of | America | and the internal

head(kwic(inaugural_tokens, "democracy", 3))

## Keyword-in-context with 6 matches.                                                           
##  [text10, 1424] a confederated representative | democracy |
##   [text14, 497]                    to that of | democracy |
##  [text14, 1474]       a simple representative | democracy |
##  [text14, 6900]                   the name of | democracy |
##  [text14, 7289]                of devotion to | democracy |
##   [text34, 970]      temple of representative | democracy |
##                       
##  were a government    
##  If such is           
##  or republic and      
##  they speak warning   
##  The foregoing remarks
##  to be not

By default, kwic() matches whole tokens, so the pattern "terror" returns only exact matches. What if we also wanted to see words like “terrorism” and “terrorist”? We can use the wildcard character * to expand our search by appending it to the end of the pattern. For example, we could use "terror*". Try this now in the kwic() function.

kwic(inaugural_tokens, "terror*", 3)

## Keyword-in-context with 12 matches.                                                              
##   [text3, 1190]                   or violence by |  terror   |
##    [text37, 99] nameless unreasoning unjustified |  terror   |
##   [text39, 257]                  by a fatalistic |  terror   |
##   [text44, 761]             uncertain balance of |  terror   |
##   [text44, 872]                   instead of its |  terrors  |
##   [text49, 700]               Americans from the |  terror   |
##  [text49, 1910]               those who practice | terrorism |
##   [text53, 921]                the fanaticism of |  terror   |
##  [text53, 1454]           strong defense against |  terror   |
##  [text56, 1442]                 aims by inducing |  terror   |
##   [text58, 975]          against radical Islamic | terrorism |
##   [text59, 471]         white supremacy domestic | terrorism |
##                            
##  intrigue or venality      
##  which paralyzes needed    
##  we proved that            
##  that stays the            
##  Together let us           
##  of runaway living         
##  and prey upon             
##  And they torment          
##  and destruction Our       
##  and slaughtering innocents
##  which we will             
##  that we must
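
The wildcard works because kwic() treats patterns as case-insensitive "glob" expressions by default. As a small illustration of that matching rule outside quanteda, the sketch below uses base R's glob2rx() on a made-up token vector; the tokens are hypothetical.

```r
# Toy token vector (hypothetical; not drawn from the corpus)
toks <- c("terror", "Terrorism", "terrors", "territory", "error", "deterrent")

# glob2rx() converts a glob such as "terror*" into an anchored regular expression
rx <- glob2rx("terror*")

# Case-insensitive match against whole tokens
matched <- toks[grepl(rx, toks, ignore.case = TRUE)]
print(matched)
```

Because the pattern is anchored at the start of the token, "deterrent" and "error" are not matched even though they contain similar letter sequences.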

5. Creating a `dfm()`

Document-feature matrices are the standard way of representing text as quantitative data. Fortunately, it is very simple to convert the tokens objects in quanteda into dfms.

Create a document-feature matrix, using dfm() applied to the inaugural_tokens object you created above. First, read the documentation using ?dfm to see the available options. Once you have created the dfm, use the topfeatures() function to inspect the 20 most frequently occurring features in the dfm. What kinds of words do you see?

mydfm <- dfm(inaugural_tokens)
mydfm

## Document-feature matrix of: 59 documents, 9,351 features (91.85% sparse) and 4 docvars.
##        features
## docs    fellow-citizens  of the senate and house representatives among
##   text1               1  71 116      1  48     2               2     1
##   text2               0  11  13      0   2     0               0     0
##   text3               3 140 163      1 130     0               2     4
##   text4               2 104 130      0  81     0               0     1
##   text5               0 101 143      0  93     0               0     7
##   text6               1  69 104      0  43     0               0     0
##        features
## docs    vicissitudes incident
##   text1            1        1
##   text2            0        0
##   text3            0        0
##   text4            0        0
##   text5            0        0
##   text6            0        0
## [ reached max_ndoc ... 53 more documents, reached max_nfeat ... 9,341 more features ]

topfeatures(mydfm, 20)

##   the    of   and    to    in     a   our    we  that    be    is    it   for 
## 10183  7180  5406  4591  2827  2292  2224  1827  1813  1502  1491  1398  1230 
##    by  have which   not  with    as  will 
##  1091  1031  1007   980   970   966   944

Mostly stop words!
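
If the structure of a dfm feels abstract, the following base R sketch builds a tiny document-feature matrix by hand from three made-up documents and ranks the features by total count, which is roughly what dfm() and topfeatures() do. The documents and object names are illustrative assumptions, and a real dfm is stored as a sparse matrix rather than a dense one.

```r
# Toy documents, already lower-cased and free of punctuation (hypothetical)
docs <- c(d1 = "the people and the union",
          d2 = "the union of the states",
          d3 = "peace and liberty")

tok_list <- strsplit(docs, " ", fixed = TRUE)
vocab <- sort(unique(unlist(tok_list)))

# One row per document, one column per feature; cells hold counts
toy_dfm <- t(sapply(tok_list, function(x) table(factor(x, levels = vocab))))

# Total count of each feature across documents, largest first
top <- sort(colSums(toy_dfm), decreasing = TRUE)

print(toy_dfm)
print(head(top, 3))
```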

Experiment with different dfm_* functions, such as dfm_wordstem(), dfm_remove() and dfm_trim(). These functions allow you to reduce the size of the dfm following its construction. How does the number of features in your dfm change as you apply these functions to the dfm object you created in the question above?

dim(mydfm)

## [1]   59 9351

dim(dfm_wordstem(mydfm))

## [1]   59 5508

dim(dfm_remove(mydfm, pattern = stopwords("english")))

## [1]   59 9213

dim(dfm_trim(mydfm, min_termfreq = 5, min_docfreq = 0.01, termfreq_type = "count", docfreq_type = "prop"))

## [1]   59 2738
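
To see why these functions shrink the feature set, here is a minimal base R sketch of the two filtering ideas (dropping listed stop words, then dropping rare features) on a made-up count matrix. The numbers, the one-word stop list and the threshold are assumptions for illustration; dfm_remove() and dfm_trim() offer more options than this.

```r
# Toy count matrix: 3 documents x 5 features (hypothetical numbers)
m <- matrix(c(5, 0, 1, 0, 2,
              4, 1, 0, 0, 3,
              6, 0, 0, 1, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("d", 1:3),
                            c("the", "senate", "peace", "event", "union")))

stop_list <- c("the")   # stand-in for stopwords("en")
min_termfreq <- 2       # minimum total count across all documents

# Drop stop words, then drop features whose total count is below the threshold
no_stops <- m[, !colnames(m) %in% stop_list, drop = FALSE]
trimmed <- no_stops[, colSums(no_stops) >= min_termfreq, drop = FALSE]

print(dim(m))
print(dim(no_stops))
print(dim(trimmed))
```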

Use the dfm_remove() function to remove English-language stopwords from this data. You can get a list of English stopwords by using stopwords("english").

mydfm_nostops <- dfm_remove(mydfm, pattern = stopwords("en"))
mydfm_nostops

## Document-feature matrix of: 59 documents, 9,213 features (92.66% sparse) and 4 docvars.
##        features
## docs    fellow-citizens senate house representatives among vicissitudes
##   text1               1      1     2               2     1            1
##   text2               0      0     0               0     0            0
##   text3               3      1     0               2     4            0
##   text4               2      0     0               0     1            0
##   text5               0      0     0               0     7            0
##   text6               1      0     0               0     0            0
##        features
## docs    incident life event filled
##   text1        1    1     2      1
##   text2        0    0     0      0
##   text3        0    2     0      0
##   text4        0    1     0      0
##   text5        0    2     0      0
##   text6        0    1     0      1
## [ reached max_ndoc ... 53 more documents, reached max_nfeat ... 9,203 more features ]

You can easily use quanteda to subset a corpus. There is a corpus_subset() method defined for a corpus, which works just like R’s normal subset() command. For instance if you want a wordcloud of just Obama’s two inaugural addresses, you would need to subset the corpus first:

obama_corpus <- corpus_subset(inaugural_corpus, President == "Obama")
obama_tokens <- tokens(obama_corpus)
obama_dfm <- dfm(obama_tokens)
textplot_wordcloud(obama_dfm)

Try producing that plot without the stopwords and without punctuation. To remove stopwords, use dfm_remove(). To remove punctuation, pass remove_punct = TRUE to the tokens() function.

obama_tokens <- tokens(obama_corpus, remove_punct = TRUE)
obama_dfm <- dfm(obama_tokens)
obama_dfm <- dfm_remove(obama_dfm, pattern = stopwords("en"))
textplot_wordcloud(obama_dfm)

6. Descriptive statistics

We can plot the type-token ratio of the inaugural speeches over time. To do this, begin by applying the summary() function to the inaugural_corpus object to summarise each speech, and examine the results.

token_info <- summary(inaugural_corpus)

Get the type-token ratio for each text, and plot the resulting vector of TTRs as a function of the Year. Hint: see ?textstat_lexdiv.

# inaugural_tokens was already created with remove_punct = TRUE, so dfm() needs no extra arguments
inaugural_dfm <- dfm(inaugural_tokens)

ttr_by_speech <- textstat_lexdiv(inaugural_dfm, "TTR")

plot(inaugural_dfm$Year, ttr_by_speech$TTR, main = "TTR by year", xlab = "Year", 
     ylab = "TTR", pch = 19, bty = "n", type = "b")
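
For reference, the type-token ratio is simply the number of distinct word types divided by the total number of tokens. A minimal base R sketch on a made-up token vector (the helper name ttr and the tokens are illustrative, not part of quanteda):

```r
# Type-token ratio: distinct types divided by total tokens
ttr <- function(tokens) length(unique(tokens)) / length(tokens)

# Toy lower-cased tokens (hypothetical)
speech <- c("we", "the", "people", "and", "we", "the", "union")

# 5 distinct types out of 7 tokens
print(ttr(speech))
```

Longer texts tend to repeat words more often, so TTR usually falls as document length grows; keep this in mind when comparing speeches of very different lengths.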

Use the corpus_subset() function to select the speeches given by presidents between 1900 and 1950. Then, using this subset, measure the term similarities (textstat_simil) for the following words: economy, health, women. Which other terms are most associated with each of these three terms?

inaugural_corpus_subset <- corpus_subset(inaugural_corpus, Year %in% c(1900:1950))

inaugural_tokens_subset <- tokens(inaugural_corpus_subset)

inaugural_dfm_subset <- dfm(inaugural_tokens_subset)
        
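# Note: as written, this call uses inaugural_dfm (all 59 speeches), which is what the
# output below reflects. To restrict the comparison to 1900-1950, pass
# inaugural_dfm_subset in both places instead.
word_similarities <- textstat_simil(inaugural_dfm, inaugural_dfm[,c("economy", "health", "women")], 
                                    margin = "features")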

## as(<dgTMatrix>, "dgeMatrix") is deprecated since Matrix 1.5-0; do as(., "unpackedMatrix") instead

head(word_similarities[order(word_similarities[, 1], decreasing = TRUE), ])

## 6 x 3 Matrix of class "dgeMatrix"
##              economy      health       women
## economy    1.0000000  0.14539594  0.24192722
## tax        0.7436018 -0.13085084  0.16173260
## logic      0.6851278 -0.08844713  0.16348013
## earn       0.5942109 -0.00861457  0.16191595
## appeals    0.5854396  0.01483933 -0.03752117
## distressed 0.5706133 -0.10928807  0.09463121

head(word_similarities[order(word_similarities[, 2], decreasing = TRUE), ])

## 6 x 3 Matrix of class "dgeMatrix"
##                     economy    health      women
## health           0.14539594 1.0000000 0.40166095
## culture          0.13216420 0.7029317 0.38533448
## denial           0.10833782 0.7006715 0.12738901
## instrumentality  0.06197579 0.6953975 0.17261874
## ideals           0.15217487 0.6804357 0.39206336
## heightened      -0.05930204 0.6570358 0.07658529

head(word_similarities[order(word_similarities[, 3], decreasing = TRUE), ])

## 6 x 3 Matrix of class "dgeMatrix"
##                 economy    health     women
## women        0.24192722 0.4016610 1.0000000
## job          0.15463850 0.1048881 0.6849277
## day          0.09069681 0.1793114 0.6553479
## bless        0.32411179 0.1349848 0.6102701
## tell        -0.06194760 0.2874803 0.6058116
## sentimental -0.05930204 0.1600472 0.5979544
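
The similarity scores above can be read as correlations between feature columns, assuming textstat_simil() is left at what we understand to be its default method ("correlation"). The base R sketch below computes the same kind of quantity on a made-up count matrix; the numbers are hypothetical.

```r
# Toy document-feature counts: 4 documents x 3 features (hypothetical)
m <- matrix(c(2, 4, 0,
              1, 2, 3,
              0, 0, 5,
              3, 6, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("d", 1:4), c("economy", "tax", "peace")))

# Pearson correlation between every pair of feature columns
sims <- cor(m)

# Rank all features by their similarity to "economy"
ranked <- sort(sims[, "economy"], decreasing = TRUE)
print(ranked)
```

Two terms score highly when they rise and fall together across documents, which is why a term always has similarity 1 with itself.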

7. Working with dictionaries

Dictionaries are named lists, consisting of a “key” and a set of entries defining the equivalence class for the given key. To create a simple dictionary of parts of speech, for instance we could define a dictionary consisting of articles and conjunctions, using:

# Note: "and" is listed under both keys, so it is counted as an article as well as a conjunction
pos_dict <- dictionary(list(articles = c("the", "a", "and"),
                           conjunctions = c("and", "but", "or", "nor", "for", "yet", "so")))

To let this define a set of features, we can use this dictionary on the dfm object we created above. To do so, apply the dfm_lookup() function to the relevant dfm object, with the dictionary argument equal to the pos_dict created above:

pos_dfm <- dfm_lookup(inaugural_dfm, dictionary = pos_dict)
pos_dfm[1:10,]

## Document-feature matrix of: 10 documents, 2 features (0.00% sparse) and 4 docvars.
##        features
## docs    articles conjunctions
##   text1      178           73
##   text2       15            4
##   text3      344          192
##   text4      232          109
##   text5      256          126
##   text6      166           63
## [ reached max_ndoc ... 4 more documents ]
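
Conceptually, a dictionary lookup collapses the columns of the dfm that match each key's entries into a single column per key. Here is a minimal base R sketch of that idea with a made-up matrix and a hypothetical two-key dictionary (not the pos_dict above); dfm_lookup() additionally handles glob patterns and multi-word entries.

```r
# Hypothetical dictionary: each key maps to a set of entries
toy_dict <- list(articles = c("the", "a", "an"),
                 conjunctions = c("and", "but", "or"))

# Toy count matrix: 2 documents x 5 features (hypothetical numbers)
m <- matrix(c(3, 1, 2, 0, 1,
              2, 0, 1, 1, 0),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("d1", "d2"), c("the", "a", "and", "but", "union")))

# For each key, sum the columns whose names appear among that key's entries
lookup <- sapply(toy_dict, function(entries) {
  rowSums(m[, colnames(m) %in% entries, drop = FALSE])
})
print(lookup)
```

Features that match no key (here "union") are dropped from the result.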

Plot the counts of articles and conjunctions (actually, here just the coordinating conjunctions) across the speeches. (Hint: you can use docvars(inaugural_corpus, "Year") for the x-axis.) Is the distribution of normalized articles and conjunctions relatively constant across years, as you would expect?

par(mfrow = c(1,2))
plot(inaugural_corpus$Year, as.numeric(pos_dfm[, 1]))
plot(inaugural_corpus$Year, as.numeric(pos_dfm[, 2]))

The previous analysis uses the raw counts of articles and conjunctions, which depend on the length of the speech: longer speeches will, on average, use more of both. To remove this dependency, we can weight the document-feature matrix by document length and re-compute. For this, we first have to compute the full dfm (using dfm()), then convert the counts to within-document proportions (using dfm_weight() with the scheme argument equal to "prop"), and finally apply the dictionary (using dfm_lookup()). Apply these steps and then create a plot showing the weighted counts of articles and conjunctions over time.

inaugural_dfm <- dfm(inaugural_tokens)
inaugural_wgt <- dfm_weight(inaugural_dfm, scheme = "prop")
pos_wgt <- dfm_lookup(inaugural_wgt, dictionary = pos_dict)

pos_wgt[1:10, ]

## Document-feature matrix of: 10 documents, 2 features (0.00% sparse) and 4 docvars.
##        features
## docs     articles conjunctions
##   text1 0.1244755   0.05104895
##   text2 0.1111111   0.02962963
##   text3 0.1484038   0.08283003
##   text4 0.1344148   0.06315180
##   text5 0.1181902   0.05817175
##   text6 0.1412766   0.05361702
## [ reached max_ndoc ... 4 more documents ]

# For easier processing, you can turn it into a data.frame and add the Year
pos_wgt_df <- convert(pos_wgt, to = "data.frame")
pos_wgt_df <- cbind(pos_wgt_df, year = inaugural_corpus$Year)

par(mfrow = c(1,1))
plot(pos_wgt_df$year, pos_wgt_df$articles, type = "l", col = "orange",
     ylim = range(pos_wgt), xlab = "Year", ylab = "Weighted count")
lines(pos_wgt_df$year, pos_wgt_df$conjunctions, type = "l", col = "blue")
legend("topright", legend = c("articles", "conjunctions"), col = c("orange","blue"), lty = 1)
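
The proportional weighting used here amounts to dividing each row of counts by that document's total. A minimal base R sketch with made-up numbers shows how it removes the effect of document length:

```r
# Toy counts: a long and a short document with the same relative usage (hypothetical)
m <- matrix(c(10, 5, 85,
               2, 1, 17),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("long", "short"), c("the", "and", "other")))

# Divide every count by its document (row) total, as scheme = "prop" does
prop <- m / rowSums(m)
print(prop)
```

The raw counts differ by a factor of five, but the two rows of proportions are identical and each sums to 1.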

Create a new dictionary capturing a concept of your own choosing (perhaps something like “democracy” or “optimism”). Apply this dictionary to the inaugural speeches data and plot the prevalence of that concept in speeches made by US Presidents over time.

optimist_dict <- dictionary(list(optimist = c("best", "better", "great", "awesome", "good",
                                              "fantastic", "terrific", "tremendous", "amazing",
                                              "superior", "astounding", "astonishing",
                                              "extraordinary", "incredible", "magnificent")))

inaugural_dfm <- dfm(inaugural_tokens)
inaugural_wgt <- dfm_weight(inaugural_dfm, scheme = "prop")
optimist_wgt <- dfm_lookup(inaugural_wgt, dictionary = optimist_dict)

# For easier processing, you can turn it into a data.frame and add the Year
optimist_wgt_df <- convert(optimist_wgt, to = "data.frame")
optimist_wgt_df <- cbind(optimist_wgt_df, year = inaugural_corpus$Year)

par(mfrow = c(1,1))
plot(optimist_wgt_df$year, optimist_wgt_df$optimist, type = "l", col = "orange",
     ylim = range(optimist_wgt), xlab = "Year", ylab = "Weighted count")