E-book: Quantitative Corpus Linguistics with R: A Practical Introduction

Stefan Th. Gries (University of California at Santa Barbara, USA)
  • Length: 286 pages
  • Publication date: 14-Oct-2016
  • Publisher: Routledge
  • Language: English
  • ISBN-13: 9781317597667
  • Format: PDF+DRM
  • Price: 70,19 €*
  • * The price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer)

    This e-book cannot be read on an Amazon Kindle.

As in its first edition, the new edition of Quantitative Corpus Linguistics with R demonstrates how to process corpus-linguistic data with the open-source programming language and environment R. Geared in general towards linguists working with observational data, and particularly corpus linguists, it introduces R programming with emphasis on:

  • data processing and manipulation in general;
  • text processing, with and without regular expressions, of large bodies of textual and/or literary data; and
  • basic aspects of statistical analysis and visualization.

This book is extremely hands-on and leads the reader through dozens of small applications as well as larger case studies. Along with an array of exercise boxes and separate answer keys, the text features a didactic sequential approach in case studies by way of subsections that zoom in to every programming problem. The companion website to the book contains all relevant R code (amounting to approximately 7,000 lines of heavily commented code), most of the data sets as well as pointers to others, and a dedicated Google newsgroup. This new edition is ideal for both researchers in corpus linguistics and instructors who want to promote hands-on approaches to data in corpus linguistics courses.

List of Figures
List of Tables
Acknowledgments
1 Introduction
1.1 Why Another Introduction to Corpus Linguistics?
1.2 Outline of the Book
2 The Four Central Corpus-Linguistic Methods
2.1 Corpora
2.1.1 What Is a Corpus?
2.1.2 What Kinds of Corpora Are There?
2.2 Frequency Lists
2.3 Dispersion Information
2.4 Lexical Co-occurrence: Collocations
2.5 (Lexico-)Grammatical Co-occurrence: Concordances
3 An Introduction to R
3.1 Data Structures, Functions, Arguments
3.2 Vectors
3.2.1 Basics
3.2.2 Loading Vectors
3.2.3 Accessing and Processing (Parts of) Vectors
3.2.4 Saving Vectors
3.3 Factors
3.4 Data Frames
3.4.1 Generating Data Frames in R
3.4.2 Loading and Saving Data Frames in R
3.4.3 Accessing and Processing (Parts of) Data Frames in R
3.5 Lists
3.6 Elementary Programming Issues
3.6.1 Conditional Expressions
3.6.2 Loops
3.6.3 Rules of Programming
3.7 Character/String Processing
3.7.1 Getting Information From and Accessing Character Vectors
3.7.2 Elementary Ways to Change Character Vectors
3.7.3 Merging/Splitting Character Vectors Without Regular Expressions
3.7.4 Searching and Replacing Without Regular Expressions
3.7.5 Searching and Replacing With Regular Expressions
3.7.6 Merging/Splitting Character Vectors With Regular Expressions
3.8 Two Particularly Relevant Areas: Unicode and XML
3.8.1 Some Notes on Handling Unicode
3.8.2 Some Notes on Handling XML Data
3.9 File and Directory Operations
3.10 Writing Your Own Functions and Some Final Recommendations
4 Some Basic Statistical Notions and Tests
4.1 Introduction to Statistical Thinking
4.1.1 Variables and Their Roles in an Analysis
4.1.2 Variables and Their Information Value
4.1.3 Hypotheses: Formulation and Operationalization
4.1.4 Data Analysis
4.1.5 Hypothesis (and Significance) Testing
4.2 Categorical Dependent Variables
4.2.1 No Independent Variables
4.2.2 One Independent Categorical Variable
4.3 Numeric Dependent Variables
4.3.1 No Independent Variables
4.3.2 One Independent Categorical Variable
4.3.3 One Independent Numeric Variable
4.4 Reporting Results
5 Using R in Corpus Linguistics: Case Studies
5.1 Dispersion
5.1.1 Dispersion 1: HIV, Keeper, and Lively in the BNC
5.1.2 Dispersion 2: Perl in a Wikipedia Entry
5.2 Frequencies, Frequency Lists, and Key Words
5.2.1 Character N-Grams
5.2.2 Word N-Grams
5.2.3 Zero-Derivation of Run and Walk in the BNC
5.2.4 Word and Sentence Lengths in the BNC
5.2.5 Approximating Syntactic Complexity: Fichtner's C
5.2.6 Key Words
5.2.7 Frequencies of -ic and -ical Adjectives
5.2.8 Frequencies of All Word-Tag Combinations in the BNC
5.3 Co-Occurrence Data: Collocation/Colligation/Collostruction
5.3.1 The Collocation Alphabetical Order in the BNC
5.3.2 Frequencies of Collocates of -ic and -ical Adjectives
5.3.3 The Reduction of to BE Before Verbs
5.3.4 Verb Collexemes After Must
5.3.5 Noun Collocates After Speed Adjectives in COCA (Fiction)
5.3.6 Collocates of Will and Shall in COHA (1810–1890)
5.3.7 Split Infinitives
5.4 Other Applications
5.4.1 Corpus Conversion: the ICE-GB
5.4.2 Three Indexing Applications
5.4.3 Playing With CELEX
5.4.4 Match All Numbers
5.4.5 Retrieving Adjective Sequences From Untagged Corpora
5.4.6 Type-Token Ratios/Vocabulary Growth: Hamlet vs. Macbeth
5.4.7 Hyphenated Forms and Their Alternative Spellings
5.4.8 Lexical Frequency Profiles
5.4.9 CHAT Files 1: Eve's MLUs and ttrs
5.4.10 CHAT Files 2: Merging Multiple Files
6 Next Steps ...
Appendix
Index
Stefan Th. Gries is Professor of Linguistics at the University of California, Santa Barbara, USA.