feat: add support for creating a Matrix Factorization model #1330
Merged
Changes from 4 commits
153 commits
6783a0a
docs: update title of pypi notebook example to reflect use of the PyP…
tswast 1d39560
feat: add support for creating a Matrix Factorization model
rey-esp e19c262
feat: add support for creating a Matrix Factorization model
rey-esp 1bef4a2
feat: add support for creating a Matrix Factorization model
rey-esp d157cd7
Merge branch 'main' into b338873783-matrix-factorization
rey-esp e336bde
Update bigframes/ml/decomposition.py
rey-esp d5f713a
Update bigframes/ml/decomposition.py
rey-esp 5e3e443
Update bigframes/ml/decomposition.py
rey-esp 34a60bc
Merge branch 'main' into b338873783-matrix-factorization
rey-esp c116e8a
rating_col
rey-esp dedef39
(nearly) complete class
rey-esp e5165a9
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 05eb854
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 2787178
removem print()
rey-esp 8c66e07
removem print()
rey-esp 086b4dd
adding recommend
rey-esp 8ed3ccd
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 1b4eef9
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 7c371ac
remove hyper parameter runing references
rey-esp 7498c8c
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 55ef06a
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 29805b5
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 8de384a
swap predict in _mf for recommend
rey-esp 647532b
recommend -> predict
rey-esp b340c4f
update predict doc string
rey-esp 580de41
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 29ee357
Merge branch 'main' into b338873783-matrix-factorization
rey-esp bac2ece
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 3f22c23
Merge branch 'b338873783-matrix-factorization' of github.com:googleap…
rey-esp 213f11d
Merge branch 'main' into b338873783-matrix-factorization
rey-esp aaf0d1f
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 4c90c1d
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 792bd64
Merge branch 'b338873783-matrix-factorization' of github.com:googleap…
rey-esp ed279be
Merge branch 'main' into b338873783-matrix-factorization
rey-esp ba5beb3
preparing test files
rey-esp 86fb956
Merge branch 'main' into b338873783-matrix-factorization
rey-esp a29bbcf
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 8577833
add test data
rey-esp a92007c
Merge branch 'main' into b338873783-matrix-factorization
rey-esp a808429
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 4b7b4db
new error: to_gbq column names need to be changed?
rey-esp 8d55eac
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 9195658
Merge branch 'main' into b338873783-matrix-factorization
rey-esp faa4d6b
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 76a9934
Merge branch 'b338873783-matrix-factorization' of github.com:googleap…
rey-esp bef7808
Delete demo.ipynb
rey-esp f18104d
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 9b39a99
Merge branch 'b338873783-matrix-factorization' of github.com:googleap…
rey-esp 0dd033d
passing system test
rey-esp 60faed1
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 1f85b75
preparing to add unit tests
rey-esp 7efc63d
Merge branch 'main' into b338873783-matrix-factorization
rey-esp a457639
2 out of 3 (so far) passing unit tests
rey-esp 89790ac
Merge branch 'main' into b338873783-matrix-factorization
rey-esp a057a8f
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 512332e
attempted mocking
rey-esp 741e749
Merge branch 'main' into b338873783-matrix-factorization
rey-esp f902131
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 310257d
Merge branch 'b338873783-matrix-factorization' of github.com:googleap…
rey-esp 408e807
fix tests
rey-esp 19e423b
new test file for model creation unit tests
rey-esp 2c107df
Merge branch 'main' into b338873783-matrix-factorization
rey-esp c7c8eea
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 5f1a19a
add unit tests for num_factors, user_col, and item_col
rey-esp 68e308b
Merge branch 'b338873783-matrix-factorization' of github.com:googleap…
rey-esp 33f3069
Update tests/unit/ml/test_matrix_factorization.py
rey-esp 1ff6aaa
Update tests/unit/ml/test_matrix_factorization.py
rey-esp 446712b
Merge branch 'main' into b338873783-matrix-factorization
rey-esp c84dd7e
uncomment one test
rey-esp 3473037
uncomment test
rey-esp b3809e5
uncomment test
rey-esp 7e8a5b6
uncomment test
rey-esp eba88d9
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 8599d88
nearly all tests
rey-esp 8ab8818
tests complete and passing
rey-esp b4d3578
seeing if test causes kokoro failure
rey-esp a63cb90
uncomment test-kokoro still failing
rey-esp 3695f80
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 336bffd
Merge branch 'tswast-patch-1' into b338873783-matrix-factorization
rey-esp bb6130a
Merge branch 'main' into b338873783-matrix-factorization
rey-esp e69438d
remove comment
rey-esp 05da834
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 087953f
fix test
rey-esp bfe9140
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 8d3599e
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 248a3b1
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 157daea
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 8912663
test kokoro
rey-esp 35a8c18
test_decomposition.py failing and now feedback_type attr does not exist
rey-esp ac182be
Merge branch 'main' into b338873783-matrix-factorization
rey-esp ff58ff5
passing tests
rey-esp f0a6ba2
Update bigframes/ml/decomposition.py
rey-esp aaad5f5
Merge branch 'main' into b338873783-matrix-factorization
rey-esp b586c5c
Update tests/system/large/ml/test_decomposition.py
rey-esp 04ddd5e
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 8e875ae
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 565138a
doc attempt - _mf.py example
rey-esp b39661f
Merge branch 'b338873783-matrix-factorization' of github.com:googleap…
rey-esp c0ef08f
feedback_type case ignore
rey-esp 4b53b04
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 342cbd1
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 8812f33
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 24b8e0c
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 664de04
Update _mf.py - remove global_explain()
rey-esp 63e8e9c
fit
rey-esp 3e52cd4
pull?
rey-esp c2e9a5f
W
rey-esp 28c4602
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 1240eeb
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 46f1ea6
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 193b9c8
fix docs (maybe)
rey-esp 5a547f8
Update test_matrix_factorization.py with updated error messages
rey-esp 23d8fc8
ilnt
rey-esp ed99ad7
Update test_matrix_factorization.py - add 'f'
rey-esp e305950
improve errors and update tests
rey-esp 411fe1a
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 4273a99
Merge branch 'main' into b338873783-matrix-factorization
rey-esp b9f6a52
Merge branch 'main' into b338873783-matrix-factorization
rey-esp b92ed1f
Merge branch 'main' into b338873783-matrix-factorization
rey-esp aaf34eb
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 46601c4
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 0823db2
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 32917e5
Update tests/system/large/ml/test_decomposition.py
rey-esp e485d3b
Update bigframes/ml/decomposition.py - num_factors error messsage
rey-esp 6a27083
Update bigframes/ml/decomposition.py - user_col error message
rey-esp 6e2d902
Update bigframes/ml/decomposition.py - rating_col error message
rey-esp b65c637
Update bigframes/ml/decomposition.py - l2_reg error msg
rey-esp 93ac0fa
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 74ebe27
fix tests to match updated error messages
rey-esp b2ebcf7
Merge branch 'b338873783-matrix-factorization' of github.com:googleap…
rey-esp 3f40763
Update third_party/bigframes_vendored/sklearn/decomposition/_mf.py - …
rey-esp 2cbc2e3
Update third_party/bigframes_vendored/sklearn/decomposition/_mf.py - …
rey-esp 0a5aefb
Update third_party/bigframes_vendored/sklearn/decomposition/_mf.py - …
rey-esp d484f77
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 366e0ab
Update third_party/bigframes_vendored/sklearn/decomposition/_mf.py
tswast 1eaa708
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 56ee623
remove errors and tests
rey-esp c942418
Update bigframes/ml/decomposition.py
rey-esp e0ef53e
Update bigframes/ml/decomposition.py
rey-esp 5018182
Update bigframes/ml/decomposition.py
rey-esp c088a76
Merge branch 'main' into b338873783-matrix-factorization
rey-esp f9397f1
passing system test
rey-esp b439120
E AssertionError: expected call not found.
rey-esp ffe0f33
Merge branch 'main' into b338873783-matrix-factorization
rey-esp b2698ef
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 69c8fba
Merge branch 'main' into b338873783-matrix-factorization
rey-esp 8a614c5
same # of elements in each
rey-esp 9d71c86
Merge branch 'main' into b338873783-matrix-factorization
rey-esp c2b4795
attempt
rey-esp cd20ffc
Merge branch 'main' into b338873783-matrix-factorization
rey-esp cf6e5be
doc fix
rey-esp da230b4
doc fix
rey-esp 8927072
Merge branch 'main' into b338873783-matrix-factorization
rey-esp
97 changes: 97 additions & 0 deletions
third_party/bigframes_vendored/sklearn/decomposition/_mf.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
""" Matrix Factorization. | ||
""" | ||
|
||
# Author: Alexandre Gramfort <alexandre.gramfort@inria.fr> | ||
# Olivier Grisel <olivier.grisel@ensta.org> | ||
# Mathieu Blondel <mathieu@mblondel.org> | ||
# Denis A. Engemann <denis-alexander.engemann@inria.fr> | ||
# Michael Eickenberg <michael.eickenberg@inria.fr> | ||
# Giorgio Patrini <giorgio.patrini@anu.edu.au> | ||
# | ||
# License: BSD 3 clause | ||
|
||
from abc import ABCMeta | ||
|
||
from bigframes_vendored.sklearn.base import BaseEstimator | ||
|
||
from bigframes import constants | ||
|
||
|
||
class MF(BaseEstimator, metaclass=ABCMeta): | ||
"""Matrix Factorization (MF). | ||
|
||
**Examples:** | ||
|
||
>>> import bigframes.pandas as bpd | ||
>>> from bigframes.ml.decomposition import MF | ||
>>> X = bpd.DataFrame([[1, 1], [2, 1], [3, 1.2], [4, 1], [5, 0.8], [6, 1]]) | ||
|
||
>>> model = MF(n_components=2, init='random', random_state=0) | ||
>>> W = model.fit_transform(X) | ||
>>> H = model.components_ | ||
|
||
Args: | ||
n_components (int, float or None, default None): | ||
Number of components to keep. If n_components is not set, all | ||
components are kept, n_components = min(n_samples, n_features). | ||
If 0 < n_components < 1, select the number of components such that the amount of variance that needs to be explained is greater than the percentage specified by n_components. | ||
num_factors (int or auto, default auto): | ||
Specifies the number of latent factors to use. | ||
If you aren't running hyperparameter tuning, then you can specify an INT64 value between 2 and 200. The default value is log2(n), where n is the number of training examples. | ||
|
||
user_col (str): | ||
The user column name. | ||
item_col (str): | ||
The item column name. | ||
l2_reg (float, default 1.0): | ||
The amount of L2 regularization applied. If you aren't running hyperparameter tuning, then you can specify a FLOAT64 value. The default value is 1.0. | ||
|
||
If you are running hyperparameter tuning, then you can use one of the following options: | ||
- The HPARAM_RANGE keyword and two FLOAT64 values that define the range to use for the hyperparameter. For example, L2_REG = HPARAM_RANGE(1.5, 5.0). | ||
- The HPARAM_CANDIDATES keyword and an array of FLOAT64 values that provide discrete values to use for the hyperparameter. For example, L2_REG = HPARAM_CANDIDATES([0, 1.0, 3.0, 5.0]). | ||
|
||
""" | ||
|
||
def fit(self, X, y=None): | ||
"""Fit the model according to the given training data. | ||
|
||
Args: | ||
X (bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series): | ||
Series or DataFrame of shape (n_samples, n_features). Training vector, | ||
where `n_samples` is the number of samples and `n_features` is | ||
the number of features. | ||
|
||
y (default None): | ||
Ignored. | ||
|
||
Returns: | ||
MF: Fitted estimator. | ||
""" | ||
raise NotImplementedError(constants.ABSTRACT_METHOD_ERROR_MESSAGE) | ||
|
||
def score(self, X=None, y=None): | ||
"""Calculate evaluation metrics of the model. | ||
|
||
.. note:: | ||
|
||
Output matches that of the BigQuery ML.EVALUATE function. | ||
See: https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-evaluate#matrix_factorization_models | ||
for the outputs relevant to this model type. | ||
|
||
Args: | ||
X (default None): | ||
Ignored. | ||
|
||
y (default None): | ||
Ignored. | ||
Returns: | ||
bigframes.dataframe.DataFrame: DataFrame that represents model metrics. | ||
""" | ||
raise NotImplementedError(constants.ABSTRACT_METHOD_ERROR_MESSAGE) | ||
|
||
def predict(self, X): | ||
"""Generate a predicted rating for every user-item row combination for a matrix factorization model. | ||
|
||
Args: | ||
X (bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series): | ||
Series or a DataFrame to predict. | ||
|
||
Returns: | ||
bigframes.dataframe.DataFrame: DataFrame with the predicted ratings.""" | ||
raise NotImplementedError(constants.ABSTRACT_METHOD_ERROR_MESSAGE) |
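Every method in this vendored class carries only a docstring and raises `NotImplementedError`. The sketch below shows one way such a docs-only base class can pair with a concrete subclass. `DocumentedEstimator`, `MeanEstimator`, and the message text are hypothetical stand-ins rather than names from this PR, and the way bigframes actually attaches docstrings to its implementation classes may differ.

```python
import inspect
from abc import ABCMeta

# Hypothetical stand-in for bigframes.constants.ABSTRACT_METHOD_ERROR_MESSAGE.
ABSTRACT_METHOD_ERROR_MESSAGE = "Abstract method; a concrete subclass implements it."


class DocumentedEstimator(metaclass=ABCMeta):
    """Docs-only base class: it documents the API and implements nothing."""

    def fit(self, X, y=None):
        """Fit the model according to the given training data."""
        raise NotImplementedError(ABSTRACT_METHOD_ERROR_MESSAGE)


class MeanEstimator(DocumentedEstimator):
    """Toy concrete estimator (an assumption for illustration only)."""

    def fit(self, X, y=None):
        # No docstring here, so inspect.getdoc falls back to the base class.
        self.mean_ = sum(X) / len(X)
        return self


est = MeanEstimator().fit([1.0, 2.0, 3.0])
print(inspect.getdoc(est.fit))
```

Calling `fit` on the base class itself raises `NotImplementedError`, which is the behavior the three methods in the diff share.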
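To make the `num_factors`, `user_col`, `item_col`, and `l2_reg` arguments from the docstring concrete, here is a minimal pure-Python sketch of explicit-feedback matrix factorization. It uses plain stochastic gradient descent, which is an assumption made for brevity and not a description of how BigQuery ML trains these models. The function names, the `rating_col` argument (borrowed from a commit message in this PR), the learning rate, and the sample data are all hypothetical.

```python
import random


def fit_mf(rows, user_col, item_col, rating_col, num_factors=2, l2_reg=0.01,
           learning_rate=0.02, epochs=500, seed=0):
    """Learn one factor vector per user and per item from explicit ratings."""
    rng = random.Random(seed)
    user_factors, item_factors = {}, {}
    for row in rows:
        # Small positive random start for every user and item seen in the data.
        for key, table in ((row[user_col], user_factors), (row[item_col], item_factors)):
            if key not in table:
                table[key] = [rng.uniform(0.0, 0.1) for _ in range(num_factors)]
    for _ in range(epochs):
        for row in rows:
            p, q = user_factors[row[user_col]], item_factors[row[item_col]]
            err = row[rating_col] - sum(a * b for a, b in zip(p, q))
            for k in range(num_factors):
                pk, qk = p[k], q[k]
                # Gradient step on the squared error plus the L2 penalty.
                p[k] += learning_rate * (err * qk - l2_reg * pk)
                q[k] += learning_rate * (err * pk - l2_reg * qk)
    return user_factors, item_factors


def predict_rating(user_factors, item_factors, user, item):
    """Predicted rating is the dot product of the user and item vectors."""
    return sum(a * b for a, b in zip(user_factors[user], item_factors[item]))


def mean_squared_error(rows, user_col, item_col, rating_col, user_factors, item_factors):
    errors = [
        (row[rating_col] - predict_rating(user_factors, item_factors,
                                          row[user_col], row[item_col])) ** 2
        for row in rows
    ]
    return sum(errors) / len(errors)


ratings = [
    {"user": "u1", "item": "a", "rating": 5.0},
    {"user": "u1", "item": "b", "rating": 3.0},
    {"user": "u2", "item": "a", "rating": 4.0},
    {"user": "u2", "item": "c", "rating": 1.0},
    {"user": "u3", "item": "b", "rating": 2.0},
    {"user": "u3", "item": "c", "rating": 5.0},
]

# Untrained factors (epochs=0) give a baseline error to compare against.
untrained = fit_mf(ratings, "user", "item", "rating", epochs=0)
mse_before = mean_squared_error(ratings, "user", "item", "rating", *untrained)

user_factors, item_factors = fit_mf(ratings, "user", "item", "rating", num_factors=2)
mse_after = mean_squared_error(ratings, "user", "item", "rating", user_factors, item_factors)

# Score a user-item pair that never appeared in the training rows.
unseen = predict_rating(user_factors, item_factors, "u1", "c")
```

Raising `l2_reg` shrinks the factor vectors toward zero and trades training fit for smoother predictions, while `num_factors` sets the length of each vector.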