05 Jan 04:46

ckdckd145

fa04e7f

1.8.1.4

Deprecating

`group_names` parameter

The group_names parameter of .progress() is deprecated.

To restrict an analysis to part of the DataFrame, use the selector parameter instead.

`FigureInStatmanager` Class

The FigureInStatmanager class is now deprecated.

Every figure generated by .figure() or .progress() is now returned as a matplotlib.axes.Axes or seaborn.FacetGrid object.

Returning the native objects is probably more useful than writing complicated code to keep a custom class compatible.

A styled figure is still returned by default, so users who aren't familiar with the underlying libraries are not affected.

If you are familiar with seaborn and matplotlib, feel free to manipulate the properties of the returned object directly.
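As a rough illustration of what manipulating the returned object can look like, the sketch below builds an ordinary matplotlib Axes as a stand-in and adjusts it with standard matplotlib calls. The data and labels are made up, and it assumes that .figure() hands back the Axes itself.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Stand-in for the Axes returned by statmanager-kr (assumption: in real use
# the object would come from .figure() or .progress() instead).
fig, ax = plt.subplots(figsize=(6, 4))
ax.hist([52, 55, 61, 61, 64, 70, 73, 78], bins=4)

# Ordinary matplotlib methods adjust the properties of the object.
ax.set_title("Distribution of prescore")
ax.set_xlabel("prescore")
ax.set_ylabel("count")
ax.figure.tight_layout()
```

A seaborn.FacetGrid can be handled the same way through its own methods, for example set_axis_labels().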

`group_names` parameter

The `group_names` parameter of .progress() is no longer used.

To adjust the DataFrame according to specific conditions, use the selector parameter instead.

`FigureInStatmanager` class

The FigureInStatmanager class is no longer used.

All graphs and figures generated by .figure() or .progress() are now returned as matplotlib.axes.Axes or seaborn.FacetGrid objects.

Apply any matplotlib or seaborn method to adjust the properties of the graph.

New analysis

Hierarchical linear regression is now available.

Pass 'hier_linearr' as the method parameter of .progress() to run a hierarchical linear regression.

The argument passed to the vars parameter of .progress() must be a list. You can mark each "step" by providing a list as an element within that list. For example, to build a hierarchical linear regression model predicting 'income' from several variables, your code should look like this:

import pandas as pd
from statmanager import Stat_Manager

df = pd.read_csv(r"../..", index_col = 'id')
sm = Stat_Manager(df)

step_1 = ['age', 'sex', 'education']      # ivs entered in step 1
step_2 = ['location', 'job', 'marriage']  # ivs added in step 2 with step 1 variables

sm.progress(method = 'hier_linearr', vars = ['income', step_1, step_2])
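To make the nesting concrete, here is a small sketch of how such a vars list maps to cumulative steps. expand_steps is a hypothetical helper written for this illustration. It is not part of statmanager-kr and says nothing about how the package parses the list internally.

```python
# Hypothetical helper, not part of statmanager-kr: it only illustrates how a
# nested `vars` list maps to cumulative regression steps.
def expand_steps(vars_arg):
    dependent, *steps = vars_arg
    models, predictors = [], []
    for added in steps:
        predictors = predictors + list(added)  # each step keeps earlier variables
        models.append({"dv": dependent, "added": list(added), "ivs": predictors})
    return models

step_1 = ['age', 'sex', 'education']
step_2 = ['location', 'job', 'marriage']

for i, model in enumerate(expand_steps(['income', step_1, step_2]), start=1):
    print(f"Step {i}: {model['dv']} ~ {' + '.join(model['ivs'])}")
```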

Also, .figure() is available. statmanager-kr will show the residual plot of the last regression model.

The result of a hierarchical regression looks like this:

	added_vars	R-squared of Model	p-value of Model	R-squared increased	F	p-value of F
Step 1	None	0.209	0.192	NaN	NaN	NaN
Step 2	…..	0.222	0.270	0.013	0.401	0.533

	Step 1	Step 2
Model:	OLS	OLS
Dependent Variable:	income	income
Date:	2024-01-05 11:53	2024-01-05 11:53
No. Observations:	30	30
Df Model:	4	5
Df Residuals:	25	24
R-squared:	0.209	0.222
Adj. R-squared:	0.083	0.060
AIC:	151.1306	152.6380
BIC:	158.1366	161.0452
Log-Likelihood:	-70.565	-70.319
F-statistic:	1.656	1.373
Prob (F-statistic):	0.192	0.270
Scale:	7.7586	7.9502
Omnibus:	2.238	2.075
Prob(Omnibus):	0.327	0.354
Skew:	0.535	0.534
Kurtosis:	2.340	2.402
Durbin-Watson:	1.752	1.760
Jarque-Bera (JB):	1.973	1.871
Prob(JB):	0.373	0.392
Condition No.:	2323	2477

	Step 1						Step 2
	Coef.	Std.Err.	t	P>\|t\|	[0.025	0.975]	Coef.	Std.Err.	t	P>\|t\|	[0.025	0.975]
const	10.828	3.582	3.023	0.006	3.451	18.205	9.979	3.868	2.580	0.016	1.996	17.962
var_name	-0.168	0.088	-1.908	0.068	-0.349	0.013	-0.154	0.091	-1.686	0.105	-0.343	0.035
var_name	-0.002	0.006	-0.252	0.803	-0.014	0.011	-0.001	0.006	-0.233	0.818	-0.014	0.011
var_name	-0.116	0.190	-0.614	0.545	-0.507	0.274	-0.153	0.200	-0.762	0.453	-0.566	0.261
var_name	-1.769	1.084	-1.633	0.115	-4.001	0.463	-1.703	1.102	-1.546	0.135	-3.977	0.570
..(added var)	NaN	NaN	NaN	NaN	NaN	NaN	0.138	0.218	0.630	0.534	-0.313	0.588
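The "R-squared increased" and "F" columns in the first table can be checked by hand with the standard F test for nested OLS models. The sketch below applies that textbook formula to the rounded values shown above. Whether statmanager-kr computes the statistic exactly this way is an assumption.

```python
def r2_change_f(r2_prev, r2_curr, n_added, df_resid_curr):
    """F statistic for the R-squared increase between two nested OLS models."""
    r2_increase = r2_curr - r2_prev
    f_stat = (r2_increase / n_added) / ((1 - r2_curr) / df_resid_curr)
    return r2_increase, f_stat

# Rounded values from the example output: Step 1 -> Step 2 adds one variable
# (Df Model 4 -> 5), and Step 2 has 24 residual degrees of freedom.
increase, f_stat = r2_change_f(0.209, 0.222, n_added=1, df_resid_curr=24)
print(round(increase, 3), round(f_stat, 3))
```

With these inputs the function reproduces the 0.013 increase and the F value of 0.401 reported for Step 2.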

Improvement

The Levene test and the fmax test now work when a list with more than 2 items is passed to the group_vars parameter. .figure() is also available for them.

Bug fix

Fixed bugs in logistic regression and Yuen’s t-test.

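New analysis

Hierarchical linear regression can now be applied.

Entering hier_linearr in the method parameter of .progress() runs a hierarchical linear regression.

The argument passed to the vars parameter of .progress() must be a list. You can separate each 'STEP' by providing a list as an element of that list. For example, to build a hierarchical linear regression model that predicts 'income' from several variables, the code should look like this: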

import pandas as pd
from statmanager import Stat_Manager

df = pd.read_csv(r"../..", index_col = 'id')
sm = Stat_Manager(df)

step_1 = ['age', 'sex', 'education']      # ivs entered in step 1
step_2 = ['location', 'job', 'marriage']  # ivs added in step 2 with step 1 variables

sm.progress(method = 'hier_linearr', vars = ['income', step_1, step_2])

Chaining .figure() onto the call also works as expected.

In that case, the residual plot of the regression model built in the last step is displayed.

The following is an example of the output of a hierarchical linear regression.

	added_vars	R-squared of Model	p-value of Model	R-squared increased	F	p-value of F
Step 1	None	0.209	0.192	NaN	NaN	NaN
Step 2	…..	0.222	0.270	0.013	0.401	0.533

	Step 1	Step 2
Model:	OLS	OLS
Dependent Variable:	income	income
Date:	2024-01-05 11:53	2024-01-05 11:53
No. Observations:	30	30
Df Model:	4	5
Df Residuals:	25	24
R-squared:	0.209	0.222
Adj. R-squared:	0.083	0.060
AIC:	151.1306	152.6380
BIC:	158.1366	161.0452
Log-Likelihood:	-70.565	-70.319
F-statistic:	1.656	1.373
Prob (F-statistic):	0.192	0.270
Scale:	7.7586	7.9502
Omnibus:	2.238	2.075
Prob(Omnibus):	0.327	0.354
Skew:	0.535	0.534
Kurtosis:	2.340	2.402
Durbin-Watson:	1.752	1.760
Jarque-Bera (JB):	1.973	1.871
Prob(JB):	0.373	0.392
Condition No.:	2323	2477

	Step 1						Step 2
	Coef.	Std.Err.	t	P>\|t\|	[0.025	0.975]	Coef.	Std.Err.	t	P>\|t\|	[0.025	0.975]
const	10.828	3.582	3.023	0.006	3.451	18.205	9.979	3.868	2.580	0.016	1.996	17.962
var_name	-0.168	0.088	-1.908	0.068	-0.349	0.013	-0.154	0.091	-1.686	0.105	-0.343	0.035
var_name	-0.002	0.006	-0.252	0.803	-0.014	0.011	-0.001	0.006	-0.233	0.818	-0.014	0.011
var_name	-0.116	0.190	-0.614	0.545	-0.507	0.274	-0.153	0.200	-0.762	0.453	-0.566	0.261
var_name	-1.769	1.084	-1.633	0.115	-4.001	0.463	-1.703	1.102	-1.546	0.135	-3.977	0.570
..(added var)	NaN	NaN	NaN	NaN	NaN	NaN	0.138	0.218	0.630	0.534	-0.313	0.588

Improvement

The Levene test and the fmax test now work even when a list with two or more elements is passed as the group_vars argument.

Likewise, the related .figure() works as expected.

Bug fix

Bugs found in logistic regression and Yuen’s t-test have been fixed.

26 Dec 07:27

ckdckd145

1.8.1.3

bd1b2e9

1.8.1.3

Improvements

When analysing differences between groups, a common first step is to check whether the normality assumption holds within each group. Until now, the normality tests implemented in statmanager-kr could not be applied per group through the group_vars parameter. The kstest, shapiro, and z_normal methods now work when group_vars is provided, and the results of these normality checks are printed as a pandas.DataFrame.

The .figure() function has been updated accordingly and works with kstest. However, shapiro and z_normal still need more work. Please note that .figure() currently does not work correctly when a list with more than 3 elements is provided in group_vars.

Other than that, I've fixed several small bugs that I found along the way. For example, regression analyses (linearr or logisticr) no longer fail when the dependent variable contains missing values.
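For readers who want to see what a per-group normality check amounts to, here is a minimal sketch using pandas and scipy directly. The column names and data are invented, and the layout of the resulting DataFrame is an assumption rather than the exact output of statmanager-kr.

```python
import pandas as pd
from scipy import stats

# Made-up example data: one grouping column and one score column.
df = pd.DataFrame({
    "condition": ["control"] * 6 + ["treatment"] * 6,
    "prescore": [52, 55, 61, 64, 70, 73, 48, 59, 60, 66, 71, 80],
})

rows = []
for group, values in df.groupby("condition")["prescore"]:
    w, p = stats.shapiro(values)  # Shapiro-Wilk test within one group
    rows.append({"group": group, "n": len(values), "W": round(w, 3), "p": round(p, 3)})

result = pd.DataFrame(rows).set_index("group")
print(result)
```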

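Improvements

When applying analyses that examine differences between groups, checking whether the normality assumption is met in the data of each group is a common approach. Until now, the analyses implemented to check the normality assumption did not work when the group_vars parameter was provided. This inconvenience has been resolved: kstest, shapiro, and z_normal now work when group_vars is provided. The results of the normality checks are now provided as a pandas.DataFrame.

The .figure() function has been updated accordingly and works as expected with kstest. However, shapiro and z_normal still need further work. Please note that .figure() currently does not work correctly when a list with three or more elements is provided in group_vars.

In addition, minor bugs found during the work have been fixed. For example, an error that prevented regression analyses (linearr or logisticr) from running when the dependent variable contained missing values has been fixed.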

20 Dec 08:41

ckdckd145

1.8.1.1

ce99770

1.8.1.1

New function

It is now possible to apply Yuen's two-sample t-test, a trimmed-means independent samples t-test for groups with unequal variances.

Improvements

Adjustment of figsize, font, and font_scale is now possible via the .revise() method, which is used to modify the properties of the graphs/figures produced by the analysis.

Bug Fix

Fixed a bug where pd.DataFrames embedded in the results were not saved properly when saving the output in xlsx format.

Fixed a bug where Linear Regression and Logistic Regression results were not saved properly.

Also, I found and fixed a bug where the dependent variable was not properly dummy-coded when running Multinomial Logistic Regression analysis.

New function

Yuen’s two-sample t-test can now be applied. (It is also described as an independent samples t-test using unequal variance.)

See the details here.

This t-test is an independent samples t-test that can be used when homogeneity of variance is not met.

It is applied by setting a trim ratio.

The argument to pass to method is ttest_ind_trim.

As with bootstrap, append the trim ratio directly after the argument.

Following academic recommendations and the behavior of the underlying scipy.stats.ttest_ind() function, the trim ratio is limited to the range 0 to 0.5.
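The trim ratio is easiest to understand by calling the underlying scipy function directly. The sketch below uses scipy.stats.ttest_ind with its trim argument (available since SciPy 1.7) on made-up data. It does not show the statmanager-kr call itself, whose exact method string is not reproduced here.

```python
from scipy import stats

# Made-up samples; the control group contains one extreme value.
control = [12.1, 13.4, 11.8, 12.9, 13.0, 12.5, 30.2, 12.7]
treatment = [14.2, 15.1, 13.9, 14.8, 15.5, 14.4, 14.9, 15.0]

# trim=0.2 drops 20% of the observations from each tail of each sample
# before the trimmed means are compared; scipy requires 0 <= trim < 0.5.
# equal_var=False keeps the unequal-variance form of the test.
result = stats.ttest_ind(control, treatment, equal_var=False, trim=0.2)
print(result.statistic, result.pvalue)
```

Because the extreme value in the control group is trimmed away, it no longer dominates the comparison.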

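Improvements

The .revise() method, which is used to modify the properties of graphs/figures produced by an analysis, can now adjust figsize, font, and font_scale.

Bug fix

Fixed a bug where pd.DataFrame objects included in the results were not saved properly when saving the output in xlsx format.

Fixed a bug where Linear Regression and Logistic Regression results were not saved properly.

Also found and fixed a bug where the dependent variable was not properly dummy-coded when running a Multinomial Logistic Regression analysis.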

14 Dec 01:26

ckdckd145

1.8.1.0

19871bf

1.8.1.0

If possible, see the original notice here.

Dependency

Added a dependency on XlsxWriter because the save functionality was updated.

New functions

Functions for saving results are now available. See the details here. In addition, the functions for generating graphs/figures are now fully available to help you visualize the results of your analysis.

Graphs can be generated in two ways. As with other statistical analyses, you can set the method parameter of the .progress() method. Alternatively, you can chain the newly created .figure() function onto the result of an analysis run through .progress() to generate a graph tailored to the type of analysis and its results.

For example, running sm.progress(method = 'hist', vars = 'prescore') draws a histogram.

As another example, running sm.progress(method = 'kstest', vars = 'age').figure() outputs a CDF graph along with the results of a Kolmogorov-Smirnov Test analysis.

The output graph has a new method called .revise() that allows you to change the title, xlabel, ylabel, xticks, and yticks. We've devoted a new paragraph in the documentation to explaining its usage, so check it out in more detail at that link.

Improvements

A minor bug was found and fixed in .change_dataframe().

Fixed some bugs when applying the selector parameter in .progress().

Improved the readability of the DataFrames output by some analyses.

Changed features

Correlation analysis now calculates the correlation for each pair of input variables separately, instead of restricting the data to rows with no missing values across all input variables. The n is therefore displayed for each pair.
For Pearson correlation analysis, the 95% confidence interval is now output as well.
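The difference between pairwise and listwise handling of missing values is visible in a small pure-Python sketch. The confidence interval below uses the Fisher z-transformation, a common choice. The release note does not say which method statmanager-kr uses, so treat that part as an assumption. The data are made up.

```python
import math

def pairwise_pearson(x, y, z_crit=1.96):
    """Pearson r on the rows where both values are present, with n and a 95% CI."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    r = sxy / math.sqrt(sxx * syy)
    # Fisher z-transformation; needs n > 3 and |r| < 1.
    z, se = math.atanh(r), 1 / math.sqrt(n - 3)
    ci = (math.tanh(z - z_crit * se), math.tanh(z + z_crit * se))
    return r, n, ci

age    = [23, 31, None, 45, 52, 38, 29, 41]
income = [21, 30, 28, None, 55, 41, 27, 39]
score  = [70, 64, 66, 59, None, None, 68, 60]

print(pairwise_pearson(age, income))    # uses n = 6 rows
print(pairwise_pearson(age, score))     # uses n = 5 rows
print(pairwise_pearson(income, score))  # uses n = 5 rows
```

Dropping every row with any missing value would leave only 4 rows for all three pairs, which is why the n is reported per pair.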

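Dependency

A dependency on XlsxWriter has been added because the save functionality was updated.

New functions

Functions for saving analysis results are now available. See the details here.

In addition, the functions that produce graphs for visualizing analysis results are now fully available.

As with other statistical analyses, you can produce a graph by adjusting the method parameter of the .progress() method, or you can chain the new .figure() onto the result of an analysis run through .progress() to produce a graph tailored to the type of analysis and its results.

For example, running sm.progress(method = 'hist', vars = 'prescore') draws a histogram.

As another example, running sm.progress(method = 'kstest', vars = 'age').figure() outputs a CDF graph along with the results of the Kolmogorov-Smirnov test.

The output graph can be changed through a new method called .revise(), which sets the title, xlabel, ylabel, xticks, and yticks. A new paragraph in the documentation explains its usage, so see that link for more detail.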

09 Dec 05:04

ckdckd145

1.8.0.0

387c177

1.8.0.0

General announcement

I'm planning to develop statmanager-kr in earnest as a package, adding more useful features as a tool for conducting research and applying scientific methods. Before that, I separated the features into their own methods, because most of them were intertwined in a single Python file, which made it difficult to modify existing features and add new ones. This work is finally done and has been verified to be bug-free, so more useful features will be added in the future. If you're curious about the changes, check out the GitHub repository.

Method chaining is now possible.

All additional functions now support method chaining. For example, calls like sm.set_language().progress() all work fine.

New useful additional functionality : .change_dataframe()

Added a function that changes the DataFrame of a Stat_Manager object; see the relevant paragraph in the official documentation for more details. It works the same way as when you create the object. Naturally, it also supports method chaining. For example, code such as sm.change_dataframe().set_language().progress() works just fine.
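The chaining pattern works when each configuration method returns the object itself. The toy class below is only an illustration of that idea. ToyManager, its arguments, and its output string are invented and do not mirror the real Stat_Manager implementation.

```python
class ToyManager:
    """Toy stand-in used only to show the chaining pattern; not Stat_Manager."""

    def __init__(self, data, language="kor"):
        self.data = data
        self.language = language

    def change_dataframe(self, data):
        self.data = data
        return self  # returning self is what makes chaining possible

    def set_language(self, language):
        self.language = language
        return self

    def progress(self):
        # The final call in the chain returns a result instead of self.
        return f"analysed {len(self.data)} rows ({self.language})"

tm = ToyManager([1, 2, 3])
print(tm.change_dataframe([4, 5, 6, 7]).set_language("eng").progress())
```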

Fixing bugs in a few analyses

While splitting the code into independent functions for each analysis, I naturally found bugs in a few analyses. The most serious was that f_nway and f_nway_rm were missing parts of the interaction in 3-way and higher designs. They now work as they should. A few other minor bugs have been fixed as well.

Renaming some analyses and correcting output metrics.

I've renamed some of the analyses to avoid misunderstandings. For example, the 3-way repeated measures ANOVA is now called the 3-way Mixed Repeated Measures ANOVA. I've also changed the output to include more metrics for each analysis, following the APA style reporting guide. This will be an ongoing improvement.

Remove the effectsize parameter in .progress()

Previously, the effectsize parameter in .progress() during analysis could be used to calculate the effect size. While modifying the code, I realized that this was very user-unfriendly and unnecessary. This parameter has now been removed and each analysis will automatically calculate and output the effect size if possible. This means that the behavior when the effectsize parameter was true is now automatic.

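General announcement

I plan to update statmanager-kr in earnest as a package and to keep adding useful features as a tool for conducting research and scientific methods. Before that, because most of the features were intertwined in a single Python file, which made modification and feature additions difficult, I worked on separating them into individual methods. This is now complete and has been verified to be bug-free, so more useful features will be added in the future. If you are curious about the changes, see the GitHub repository!

Method chaining is now possible.

All additional functions now work with method chaining. For example, calls such as sm.set_language().progress() all work as expected.

New useful additional function: .change_dataframe()

An additional function that changes the DataFrame of a Stat_Manager object has been added. See the relevant paragraph in the official documentation for details. It works the same way as when creating the object. Naturally, it also works with method chaining. For example, code such as sm.change_dataframe().set_language().progress() works without problems.

Bug fixes in a few analyses

While splitting the code into independent functions for each analysis, I naturally found bugs in a few analyses. The most serious was that part of the interaction was missing in f_nway and f_nway_rm for 3-way designs and above. They now work as expected. Several minor bugs have also been fixed.

Renaming analyses and revising output metrics

Some analyses have been renamed to avoid misunderstanding. For example, the correct name of the 3-way repeated measures ANOVA is the 3-way Mixed Repeated Measures ANOVA. In addition, following the APA style reporting guide, the output of each analysis now includes more of the required metrics.

Removal of the effectsize parameter in .progress()

Previously, the effectsize parameter of .progress() could be used to calculate the effect size. While modifying the code, I realized that this was very user-unfriendly and unnecessary. The parameter has now been removed, and each analysis automatically calculates and outputs the effect size where possible.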

05 Dec 08:18

ckdckd145

1.7.2.6

01778c8

1.7.2.6

Improvements and bug fixes for bootstrap functions

The number of resamples for the bootstrap percentile method can now be set freely.

In .progress(), enter method = 'bootstrap{number of resamples}'.

(e.g. sm.progress(method = 'bootstrap8000', vars = ['prescore', 'postscore']) for 8,000 resamples)

To return the bootstrapped DataFrame, append _df to method.

(e.g. sm.progress(method = 'bootstrap8000_df', vars = ['prescore', 'postscore']) for 8,000 resamples)

Also fixed an error where the .progress() method did not work without group_names when used for the bootstrap percentile method or for returning the bootstrap DataFrame.

New function: Cronbach's alpha

Cronbach's alpha can now be calculated. Usage is as follows.
sm.progress(method = 'cronbach', vars = ['item1', 'item2' , ..., ])

Related content will be added to the documentation soon.

New function (temporary implementation): figures and graphs

A function for generating figures and graphs will be added.

For now, functions that generate a pp-plot and a qq-plot have been added temporarily.

Related content will be added to the documentation soon.

Example. sm.progress(method = 'pp_plot', vars = 'prescore')

Example. sm.progress(method = 'qq_plot', vars = 'prescore')

Improvement in bootstrap related functions and bug fix

You can now freely set the number of resamples for the bootstrap percentile method.

In .progress(), set method = 'bootstrap{resamplingtime}', where {resamplingtime} is the number of resamples.

(Example: sm.progress(method = 'bootstrap8000', vars = ['prescore', 'postscore']) runs 8,000 resamples.)
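For context, the bootstrap percentile method itself can be sketched with the standard library alone. The release note does not state which statistic statmanager-kr bootstraps, so the mean paired difference used below is an assumption, and the scores are made up.

```python
import random
import statistics

def bootstrap_percentile_ci(pre, post, n_resamples=8000, alpha=0.05, seed=0):
    """Percentile CI for the mean paired difference (post - pre)."""
    rng = random.Random(seed)
    diffs = [b - a for a, b in zip(pre, post)]
    means = sorted(
        statistics.fmean(rng.choices(diffs, k=len(diffs)))  # resample with replacement
        for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lower = means[int(n_resamples * alpha / 2)]
    upper = means[int(n_resamples * (1 - alpha / 2)) - 1]
    return lower, upper

prescore = [52, 55, 61, 61, 64, 70, 73, 78]
postscore = [58, 57, 66, 65, 71, 72, 80, 85]
print(bootstrap_percentile_ci(prescore, postscore))
```

More resamples give a more stable interval at the cost of run time, which is the trade-off the adjustable count exposes.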

If you want the bootstrapped DataFrame returned, append _df to method.

(Example: sm.progress(method = 'bootstrap8000_df', vars = ['prescore', 'postscore']) returns the DataFrame from 8,000 resamples.)
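The naming convention packs two pieces of information into one string: the resample count and the optional _df suffix. The helper below is hypothetical and only demonstrates how such a string can be read. It is not the parser used inside statmanager-kr.

```python
import re

def parse_bootstrap_method(method):
    """Split e.g. 'bootstrap8000_df' into (8000, True). Hypothetical helper."""
    match = re.fullmatch(r"bootstrap(\d+)(_df)?", method)
    if match is None:
        raise ValueError(f"not a bootstrap method string: {method!r}")
    n_resamples = int(match.group(1))
    return_dataframe = match.group(2) is not None
    return n_resamples, return_dataframe

print(parse_bootstrap_method("bootstrap8000"))
print(parse_bootstrap_method("bootstrap8000_df"))
```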

Also fixed errors that occurred when group_names was not specified for the bootstrap percentile method or when returning the bootstrap DataFrame.

Add function : calculating Cronbach’s alpha

See the example below:
sm.progress(method = 'cronbach', vars = ['item1', 'item2' , ..., ])
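As a reference for what the statistic measures, the sketch below computes Cronbach's alpha from its textbook formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The item scores are made up, and the code is independent of the statmanager-kr implementation.

```python
import statistics

def cronbach_alpha(items):
    """items: list of equal-length score lists, one list per questionnaire item."""
    k = len(items)
    item_variances = sum(statistics.variance(scores) for scores in items)
    totals = [sum(row) for row in zip(*items)]  # total score per respondent
    return k / (k - 1) * (1 - item_variances / statistics.variance(totals))

item1 = [3, 4, 2, 5, 4, 3]
item2 = [3, 5, 2, 4, 4, 2]
item3 = [2, 4, 3, 5, 5, 3]
print(round(cronbach_alpha([item1, item2, item3]), 3))
```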

I'll be adding this to the documentation soon.

Add function (Temporary) : Making figures or graphs for statistics

For now, you can make a p-p plot or a q-q plot like this:

ex. sm.progress(method = 'pp_plot', vars = 'prescore')

ex. sm.progress(method = 'qq_plot', vars = 'prescore')

I'll be adding this to the documentation soon.

04 Dec 07:24

ckdckd145

1.7.2.5

2c36ac1

1.7.2.5 (hot fix)

Fixed a bug that occurred when the selector parameter was used in .progress().

A bug that occurred when using the selector parameter in .progress() was found and fixed, and the package was redistributed.

04 Dec 04:36

ckdckd145

1.7.2.4

bc3e62f

1.7.2.4

ver 1.7.2.4

Fixed a typo in the documentation link in the package.
Changed menu_for_howtouse.py so that its dataframes are tied to a .csv file, which makes them easier to modify.
Fixed a bug where changing the language via set_language() caused an error in howtouse().
Fixed typos in several reporting sentences.

01 Dec 12:31

ckdckd145

1.7.2.2

7eebb8e

1.7.2.2

See more information below:


Releases: ckdckd145/statmanager-kr
