Separates data into training and testing datasets — separate

This function separates the abstracts into training and testing sets.

separate_training(data, percentage = 0.1)

Arguments

data: A data frame of abstracts, such as one read from a CSV file.
percentage: Proportion of the data assigned to the training set. If percentage is 0.1, 10% of the data becomes the training set and the remaining 90% becomes the testing set.

Value

A list with two data frames, train and test.
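
To make the effect of percentage concrete, here is a minimal sketch of a proportional split in base R. It is not the package's implementation: the helper name split_sketch, the use of random sampling without replacement, and the rounding of the training size are all assumptions made for illustration.

```r
# Hypothetical helper, NOT the package's implementation: a random
# train/test split of a data frame by a training proportion.
split_sketch <- function(data, percentage = 0.1) {
  stopifnot(is.data.frame(data), percentage >= 0, percentage <= 1)
  n <- nrow(data)
  # Number of training rows; rounding is an assumption of this sketch
  n_train <- round(n * percentage)
  # Sample row indices without replacement for the training set
  train_idx <- sample.int(n, size = n_train)
  # Every row not sampled for training goes to the test set
  test_idx <- setdiff(seq_len(n), train_idx)
  list(
    train = data[train_idx, , drop = FALSE],
    test  = data[test_idx, , drop = FALSE]
  )
}

# Toy data standing in for a table of abstracts
toy <- data.frame(id = 1:50, Abstract = paste("abstract", 1:50))
parts <- split_sketch(toy, percentage = 0.4)
nrow(parts$train)
nrow(parts$test)
```

With 50 rows and percentage = 0.4, the sketch places 20 rows in train and 30 in test. Which rows land in each set depends on the random draw, so call set.seed() first if a reproducible split is needed.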

Examples

separate_training(abstracts, percentage = 0.4)
#> $train
#> # A tibble: 20 × 8
#>    Title       Authors Abstract `Published Year` `Published Month` Journal DOI  
#>    <chr>       <chr>   <chr>               <dbl> <chr>             <chr>   <chr>
#>  1 Sex differ… Bugiar… Aims Pr…             2023 NA                Cardio… 10.1…
#>  2 Clinical F… Yuan, … Backgro…             2021 NA                Blood … 10.1…
#>  3 A retrospe… Manier… Purpose…             2022 NA                Irish … 10.1…
#>  4 Factors As… El Mou… Backgro…             2022 NA                Nephron 10.1…
#>  5 Role of in… Kim, I… Backgro…             2023 NA                Fronti… 10.3…
#>  6 Risk facto… Contre… Backgro…             2023 NA                BMC Ne… 10.1…
#>  7 Acute Kidn… Alessa… Introdu…             2021 NA                Blood … 10.1…
#>  8 Clinical R… Hernán… Little …             2021 NA                Fronti… 10.3…
#>  9 Acute kidn… Ibrahi… Backgro…             2021 NA                Anaest… 10.3…
#> 10 Associatio… Romaní… Introdu…             2022 NA                Journa… 10.3…
#> 11 Acute Kidn… Bandel… Purpose…             2022 NA                Intern… 10.2…
#> 12 Mortality,… Al Owe… COVID-1…             2023 NA                Journa… 10.3…
#> 13 Acute kidn… Magalh… Introdu…             2023 NA                Intern… 10.1…
#> 14 Critical r… Li, X.… Acute k…             2021 NA                Journa… 10.1…
#> 15 Acute Kidn… Sabagh… Introdu…             2022 NA                Irania… 10.5…
#> 16 Predictive… Kim, S… Backgro…             2022 NA                Medici… 10.3…
#> 17 A Prospect… Rostam… Backgro…             2022 NA                Fronti… 10.3…
#> 18 Frequency … Rashid… Backgro…             2023 NA                Shiraz… 10.5…
#> 19 CHARACTERI… Rolón,… We cond…             2022 NA                Medici… NA   
#> 20 COVID-19 i… Pawelk… Backgro…             2021 NA                Infect… 10.1…
#> # ℹ 1 more variable: `Covidence #` <chr>
#> 
#> $test
#> # A tibble: 30 × 8
#>    Title       Authors Abstract `Published Year` `Published Month` Journal DOI  
#>    <chr>       <chr>   <chr>               <dbl> <chr>             <chr>   <chr>
#>  1 "Developme… Palomb… Purpose…             2023 NA                BMC Ne… 10.1…
#>  2 "Clinical … Bougue… Backgro…             2023 NA                Journa… 10.3…
#>  3 "\"Acute k… De La … This st…             2023 NA                PLoS O… 10.1…
#>  4 "Outcomes … Al-Ome… Backgro…             2023 NA                Journa… 10.4…
#>  5 "Circulati… van Li… Introdu…             2023 NA                ERJ Op… 10.1…
#>  6 "Kinetics … Greco,… Backgro…             2023 NA                Diagno… 10.3…
#>  7 "Statin th… Piani,… Backgro…             2023 NA                Nutrit… 10.1…
#>  8 "Acute kid… Eldabo… Backgro…             2023 NA                Multid… 10.4…
#>  9 "Acute kid… Ounci,… Introdu…             2022 NA                Journa… 10.1…
#> 10 "The incid… Bayrak… Introdu…             2022 NA                Therap… 10.1…
#> # ℹ 20 more rows
#> # ℹ 1 more variable: `Covidence #` <chr>
#>