From e6879c118426fe57dfdf57eb598f1db823344eb1 Mon Sep 17 00:00:00 2001
From: Soledad Galli <solegalli@protonmail.com>
Date: Mon, 10 Jul 2023 20:08:40 +0200
Subject: [PATCH 1/5] Reword the RandomUnderSampler user guide

---
 doc/under_sampling.rst | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)
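
Reviewer note: the reworded text says that each targeted class is
under-sampled independently, down to a number chosen by the user, and that
``replacement=True`` turns the draw into a bootstrap. The sketch below
illustrates that idea with the standard library only. The helper
`random_under_sample` and its `n_per_class` argument are hypothetical names
made up for this note; this is not imbalanced-learn's implementation or API.

```python
import random
from collections import defaultdict


def random_under_sample(X, y, n_per_class, replacement=False, seed=0):
    """Hypothetical helper: keep n_per_class[label] rows of each targeted class."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)

    kept = []
    for label, indices in by_class.items():
        target = n_per_class.get(label)
        if target is None:
            # Class is not targeted: keep every observation.
            kept.extend(indices)
        elif replacement:
            # Bootstrap: the same row may be drawn more than once.
            kept.extend(rng.choices(indices, k=target))
        else:
            # Without replacement: every kept row is distinct.
            kept.extend(rng.sample(indices, target))

    kept.sort()
    return [X[i] for i in kept], [y[i] for i in kept]


X = [[i, i] for i in range(10)]
y = [0] * 7 + [1] * 3
X_res, y_res = random_under_sample(X, y, {0: 3})
X_boot, y_boot = random_under_sample(X, y, {0: 5}, replacement=True)
```

Each class is drawn from its own index list, so the result for one class
never depends on the size of another.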

diff --git a/doc/under_sampling.rst b/doc/under_sampling.rst
index 9f2795430..d763b8084 100644
--- a/doc/under_sampling.rst
+++ b/doc/under_sampling.rst
@@ -77,6 +77,12 @@ and are meant for cleaning the feature space.
 Controlled under-sampling techniques
 ------------------------------------
 
+Controlled under-sampling techniques reduce the number of observations of the majority
+classes (targeted classes) to a number specified by the user.
+
+Random under-sampling
+^^^^^^^^^^^^^^^^^^^^^
+
 :class:`RandomUnderSampler` is a fast and easy way to balance the data by
 randomly selecting a subset of data for the targeted classes::
 
@@ -91,9 +97,9 @@ randomly selecting a subset of data for the targeted classes::
    :scale: 60
    :align: center
 
-:class:`RandomUnderSampler` allows to bootstrap the data by setting
-``replacement`` to ``True``. The resampling with multiple classes is performed
-by considering independently each targeted class::
+:class:`RandomUnderSampler` allows bootstrapping the data by setting
+``replacement`` to ``True``. When there are multiple classes, each targeted class is
+under-sampled independently::
 
   >>> import numpy as np
   >>> print(np.vstack([tuple(row) for row in X_resampled]).shape)
@@ -103,8 +109,8 @@ by considering independently each targeted class::
   >>> print(np.vstack(np.unique([tuple(row) for row in X_resampled], axis=0)).shape)
   (181, 2)
 
-In addition, :class:`RandomUnderSampler` allows to sample heterogeneous data
-(e.g. containing some strings)::
+:class:`RandomUnderSampler` works with numrical and also categorical variables
+(e.g. where the values are strings)::
 
   >>> X_hetero = np.array([['xxx', 1, 1.0], ['yyy', 2, 2.0], ['zzz', 3, 3.0]],
   ...                     dtype=object)
@@ -116,7 +122,8 @@ In addition, :class:`RandomUnderSampler` allows to sample heterogeneous data
   >>> print(y_resampled)
   [0 1]
 
-It would also work with pandas dataframe::
+:class:`RandomUnderSampler` can also take a pandas dataframe as input for
+undersampling::
 
   >>> from sklearn.datasets import fetch_openml
   >>> df_adult, y_adult = fetch_openml(

From 1b84f58234e734712ba657338ed134e54e979121 Mon Sep 17 00:00:00 2001
From: Soledad Galli <solegalli@protonmail.com>
Date: Mon, 10 Jul 2023 20:12:03 +0200
Subject: [PATCH 2/5] final touches

---
 doc/under_sampling.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/under_sampling.rst b/doc/under_sampling.rst
index d763b8084..2a9b8ff7c 100644
--- a/doc/under_sampling.rst
+++ b/doc/under_sampling.rst
@@ -77,7 +77,7 @@ and are meant for cleaning the feature space.
 Controlled under-sampling techniques
 ------------------------------------
 
-Controlled under-sampling techniques reduce the number of observations of the majority
+Controlled under-sampling techniques reduce the number of observations from the majority
 classes (targeted classes) to a number specified by the user.
 
 Random under-sampling
@@ -109,8 +109,8 @@ under-sampled independently::
   >>> print(np.vstack(np.unique([tuple(row) for row in X_resampled], axis=0)).shape)
   (181, 2)
 
-:class:`RandomUnderSampler` works with numrical and also categorical variables
-(e.g. where the values are strings)::
+:class:`RandomUnderSampler` can undersample numerical and also categorical variables
+(i.e., where the values are strings)::
 
   >>> X_hetero = np.array([['xxx', 1, 1.0], ['yyy', 2, 2.0], ['zzz', 3, 3.0]],
   ...                     dtype=object)

From 16161e10ff77c0d53633375af74ab29f717c4980 Mon Sep 17 00:00:00 2001
From: Soledad Galli <solegalli@protonmail.com>
Date: Tue, 11 Jul 2023 12:03:54 +0200
Subject: [PATCH 3/5] Reword the heterogeneous data sentence

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
---
 doc/under_sampling.rst | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
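
Reviewer note: one way to see why mixed column types are not a problem is
that random under-sampling only picks row indices and never computes on the
values. The sketch below uses the same three rows as the guide. The labels
`[0, 0, 1]` are an assumption made for this note, and the code is a
plain-Python illustration rather than the library's code.

```python
import random

# Rows mixing strings, integers and floats, as in the user guide example.
rows = [["xxx", 1, 1.0], ["yyy", 2, 2.0], ["zzz", 3, 3.0]]
labels = [0, 0, 1]  # assumed labels: class 0 is the majority

rng = random.Random(0)
majority = [i for i, label in enumerate(labels) if label == 0]
minority = [i for i, label in enumerate(labels) if label == 1]

# Draw as many majority indices as there are minority rows; values are untouched.
kept = sorted(rng.sample(majority, len(minority)) + minority)
rows_resampled = [rows[i] for i in kept]
labels_resampled = [labels[i] for i in kept]
```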

diff --git a/doc/under_sampling.rst b/doc/under_sampling.rst
index 2a9b8ff7c..c42c65451 100644
--- a/doc/under_sampling.rst
+++ b/doc/under_sampling.rst
@@ -109,8 +109,7 @@ under-sampled independently::
   >>> print(np.vstack(np.unique([tuple(row) for row in X_resampled], axis=0)).shape)
   (181, 2)
 
-:class:`RandomUnderSampler` can undersample numerical and also categorical variables
-(i.e., where the values are strings)::
+:class:`RandomUnderSampler` handles heterogeneous data types, i.e. numerical, categorical, date, etc.::
 
   >>> X_hetero = np.array([['xxx', 1, 1.0], ['yyy', 2, 2.0], ['zzz', 3, 3.0]],
   ...                     dtype=object)

From 7be1406fb369c7e6ab7fbfec9c845aa6bbf1cfda Mon Sep 17 00:00:00 2001
From: Soledad Galli <solegalli@protonmail.com>
Date: Tue, 11 Jul 2023 12:04:23 +0200
Subject: [PATCH 4/5] Reword the pandas dataframe sentence

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
---
 doc/under_sampling.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/under_sampling.rst b/doc/under_sampling.rst
index c42c65451..0442e0159 100644
--- a/doc/under_sampling.rst
+++ b/doc/under_sampling.rst
@@ -121,7 +121,7 @@ under-sampled independently::
   >>> print(y_resampled)
   [0 1]
 
-:class:`RandomUnderSampler` can also take a pandas dataframe as input for
+:class:`RandomUnderSampler` also supports pandas dataframes as input for
 undersampling::
 
   >>> from sklearn.datasets import fetch_openml

From 717d7ce6564439faf24098780b5ff4d3000c41c3 Mon Sep 17 00:00:00 2001
From: Soledad Galli <solegalli@protonmail.com>
Date: Tue, 11 Jul 2023 12:08:20 +0200
Subject: [PATCH 5/5] small cosmetic changes

---
 doc/under_sampling.rst | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/doc/under_sampling.rst b/doc/under_sampling.rst
index 0442e0159..a581508bf 100644
--- a/doc/under_sampling.rst
+++ b/doc/under_sampling.rst
@@ -77,8 +77,8 @@ and are meant for cleaning the feature space.
 Controlled under-sampling techniques
 ------------------------------------
 
-Controlled under-sampling techniques reduce the number of observations from the majority
-classes (targeted classes) to a number specified by the user.
+Controlled under-sampling techniques reduce the number of observations from the
+targeted classes to a number specified by the user.
 
 Random under-sampling
 ^^^^^^^^^^^^^^^^^^^^^
@@ -109,7 +109,8 @@ under-sampled independently::
   >>> print(np.vstack(np.unique([tuple(row) for row in X_resampled], axis=0)).shape)
   (181, 2)
 
-:class:`RandomUnderSampler` handles heterogeneous data types, i.e. numerical, categorical, date, etc.::
+:class:`RandomUnderSampler` handles heterogeneous data types, e.g. numerical,
+categorical, and dates::
 
   >>> X_hetero = np.array([['xxx', 1, 1.0], ['yyy', 2, 2.0], ['zzz', 3, 3.0]],
   ...                     dtype=object)