dbscan - Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Related Algorithms
A fast reimplementation of several density-based algorithms of the DBSCAN family. Includes the clustering algorithms DBSCAN (density-based spatial clustering of applications with noise) and HDBSCAN (hierarchical DBSCAN), the ordering algorithm OPTICS (ordering points to identify the clustering structure), shared nearest neighbor clustering, and the outlier detection algorithms LOF (local outlier factor) and GLOSH (global-local outlier score from hierarchies). The implementations use the kd-tree data structure (from library ANN) for faster k-nearest neighbor search. An R interface to fast kNN and fixed-radius NN search is also provided. Hahsler, Piekenbrock and Doran (2019) <doi:10.18637/jss.v091.i01>.
Last updated 2 months ago
clustering, dbscan, density-based-clustering, hdbscan, lof, optics, cpp
15.62 score 321 stars 84 dependents 1.6k scripts 33k downloads
seriation - Infrastructure for Ordering Objects Using Seriation
Infrastructure for ordering objects with an implementation of several seriation/sequencing/ordination techniques to reorder matrices, dissimilarity matrices, and dendrograms. Also provides (optimally) reordered heatmaps, color images and clustering visualizations like dissimilarity plots, and visual assessment of cluster tendency plots (VAT and iVAT). Hahsler et al (2008) <doi:10.18637/jss.v025.i03>.
Last updated 4 months ago
combinatorial-optimization, ordination, seriation, fortran
14.07 score 77 stars 79 dependents 640 scripts 34k downloads
arules - Mining Association Rules and Frequent Itemsets
Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.
Last updated 1 month ago
arules, association-rules, frequent-itemsets
13.99 score 194 stars 28 dependents 3.3k scripts 26k downloads
TSP - Infrastructure for the Traveling Salesperson Problem
Basic infrastructure and some algorithms for the traveling salesperson problem (also traveling salesman problem; TSP). The package provides some simple algorithms and an interface to the Concorde TSP solver and its implementation of the Chained-Lin-Kernighan heuristic. The code for Concorde itself is not included in the package and has to be obtained separately. Hahsler and Hornik (2007) <doi:10.18637/jss.v023.i02>.
Last updated 6 months ago
concorde-tsp-solver, tsp
12.40 score 65 stars 99 dependents 346 scripts 25k downloads
arulesViz - Visualizing Association Rules and Frequent Itemsets
Extends package 'arules' with various visualization techniques for association rules and itemsets. The package also includes several interactive visualizations for rule exploration. Michael Hahsler (2017) <doi:10.32614/RJ-2017-047>.
Last updated 7 months ago
arules, association-rules, frequent-itemsets, interactive-visualizations, visualization
11.00 score 54 stars 2 dependents 1.7k scripts 12k downloads
recommenderlab - Lab for Developing and Testing Recommender Algorithms
Provides a research infrastructure to develop and evaluate collaborative filtering recommender algorithms. This includes a sparse representation for user-item matrices, many popular algorithms, top-N recommendations, and cross-validation. Hahsler (2022) <doi:10.48550/arXiv.2205.12371>.
Last updated 7 months ago
collaborative-filtering, recommender-system
10.09 score 214 stars 2 dependents 840 scripts 3.3k downloads
stream - Infrastructure for Data Stream Mining
A framework for data stream modeling and associated data mining tasks such as clustering and classification. The development of this package was supported in part by NSF IIS-0948893, NSF CMMI 1728612, and NIH R21HG005912. Hahsler et al (2017) <doi:10.18637/jss.v076.i14>.
Last updated 13 days ago
data-stream-clustering, datastream, stream-mining, cpp
10.05 score 39 stars 3 dependents 132 scripts 1.7k downloads
rBLAST - R Interface for the Basic Local Alignment Search Tool
Seamlessly interfaces the Basic Local Alignment Search Tool (BLAST) running locally to search genetic sequence databases. This work was partially supported by grant no. R21HG005912 from the National Human Genome Research Institute.
Last updated 3 months ago
genetics, sequencing, sequencematching, alignment, dataimport, bioconductor, bioinformatics, blast-search
8.07 score 106 stars 106 scripts 287 downloads
qap - Heuristics for the Quadratic Assignment Problem (QAP)
Implements heuristics for the Quadratic Assignment Problem (QAP). Although the QAP was introduced as a combinatorial optimization problem for facility location in operations research, it also has many applications in data analysis. The problem is NP-hard, and the package implements a simulated annealing heuristic.
Last updated 7 months ago
combinatorial-optimization, heuristic, qap, quadratic-assignment-problem, fortran
7.75 score 6 stars 80 dependents 7 scripts 26k downloads
pomdp - Infrastructure for Partially Observable Markov Decision Processes (POMDP)
Provides the infrastructure to define and analyze the solutions of Partially Observable Markov Decision Process (POMDP) models. Interfaces for various exact and approximate solution algorithms are available including value iteration, point-based value iteration and SARSOP. Smallwood and Sondik (1973) <doi:10.1287/opre.21.5.1071>.
Last updated 4 months ago
control-theory, markov-decision-processes, optimization, cpp
7.03 score 19 stars 21 scripts 624 downloads
streamMOA - Interface for MOA Stream Clustering Algorithms
Interface for data stream clustering algorithms implemented in the MOA (Massive Online Analysis) framework (Albert Bifet, Geoff Holmes, Richard Kirkby, Bernhard Pfahringer (2010). MOA: Massive Online Analysis, Journal of Machine Learning Research 11: 1601-1604).
Last updated 7 months ago
clustering, datamining, datastream, openjdk
5.98 score 13 stars 37 scripts 615 downloads
markovDP - Infrastructure for Discrete-Time Markov Decision Processes (MDP)
Provides the infrastructure to work with Markov Decision Processes (MDPs) in R. The focus is on convenience in formulating MDPs, the support of sparse representations (using sparse matrices, lists and data.frames) and visualization of results. Some key components are implemented in C++ to speed up computation. Several popular solvers are implemented.
Last updated 12 days ago
control-theory, markov-decision-process, optimization, cpp
5.51 score 7 stars 4 scripts
rRDP - Interface to the RDP Classifier
This package installs and interfaces the naive Bayesian classifier for 16S rRNA sequences developed by the Ribosomal Database Project (RDP). With this package the classifier trained with the standard training set can be used or a custom classifier can be trained.
Last updated 11 months ago
genetics, sequencing, infrastructure, classification, microbiome, immunooncology, alignment, sequencematching, dataimport, bayesian, bioconductor, bioinformatics, openjdk
5.48 score 3 stars 6 scripts
arulesCBA - Classification Based on Association Rules
Provides the infrastructure for association rule-based classification including the algorithms CBA, CMAR, CPAR, C4.5, FOIL, PART, PRM, RCAR, and RIPPER to build associative classifiers. Hahsler et al (2019) <doi:10.32614/RJ-2019-048>.
Last updated 7 months ago
association-rules, classification
5.42 score 3 stars 1 dependent 47 scripts 1.6k downloads
rRDP - Interface to the RDP Classifier
This package installs and interfaces the naive Bayesian classifier for 16S rRNA sequences developed by the Ribosomal Database Project (RDP). With this package the classifier trained with the standard training set can be used or a custom classifier can be trained.
Last updated 5 months ago
genetics, sequencing, infrastructure, classification, microbiome, immunooncology, alignment, sequencematching, dataimport, bayesian, bioconductor, bioinformatics, openjdk
4.88 score 3 stars 6 scripts 260 downloads
cba - Clustering for Business Analytics
Implements clustering techniques such as Proximus and Rock, utility functions for efficient computation of cross distances and data manipulation.
Last updated 7 months ago
4.87 score 3 dependents 171 scripts 2.7k downloads
streamConnect - Connecting Stream Mining Components Using Sockets and Web Services
Adds functionality to connect stream mining components from package stream using sockets and Web services. The package can be used to create distributed workflows and plumber-based Web services that can be deployed on most common cloud services.
Last updated 7 months ago
4.08 score 4 stars 1 script 44 downloads
rEMM - Extensible Markov Model for Modelling Temporal Relationships Between Clusters
Implements TRACDS (Temporal Relationships between Clusters for Data Streams), a generalization of Extensible Markov Model (EMM). TRACDS adds a temporal or order model to data stream clustering by superimposing a dynamically adapting Markov Chain. Also provides an implementation of EMM (TRACDS on top of tNN data stream clustering). Development of this package was supported in part by NSF IIS-0948893 and R21HG005912 from the National Human Genome Research Institute. Hahsler and Dunham (2010) <doi:10.18637/jss.v035.i05>.
Last updated 7 months ago
clustering, data-stream, sequence-analysis
3.79 score 2 stars 31 scripts 621 downloads
rMSA - Interface for Popular Multiple Sequence Alignment Tools
Seamlessly interfaces the Multiple Sequence Alignment software packages ClustalW, MAFFT, MUSCLE and Kalign (downloaded separately) and provides support to calculate distances between sequences. This work was partially supported by grant no. R21HG005912 from the National Human Genome Research Institute.
Last updated 10 months ago
genetics, sequencing, infrastructure, alignment, bioinformatics, sequence-alignment
3.78 score 12 stars 7 scripts
arulesSequences - Mining Frequent Sequences
Add-on for arules to handle and mine frequent sequences. Provides interfaces to the C++ implementation of cSPADE by Mohammed J. Zaki.
Last updated 7 months ago
3.66 score 12 stars 107 scripts 1.2k downloads
pomdpSolve - Interface to 'pomdp-solve' for Partially Observable Markov Decision Processes
Installs an updated version of 'pomdp-solve' and provides a low-level interface. Pomdp-solve is a program to solve Partially Observable Markov Decision Processes (POMDPs) using a variety of exact and approximate value iteration algorithms. A convenient R infrastructure is provided in the separate package pomdp. Kaelbling, Littman and Cassandra (1998) <doi:10.1016/S0004-3702(98)00023-X>.
Last updated 7 months ago
control-theory, markov-decision-processes, optimization
3.48 score 2 stars 1 dependent 3 scripts 602 downloads
arulesNBMiner - Mining NB-Frequent Itemsets and NB-Precise Rules
NBMiner is an implementation of the model-based mining algorithm for mining NB-frequent itemsets and NB-precise rules. Michael Hahsler (2006) <doi:10.1007/s10618-005-0026-2>.
Last updated 3 years ago
association-rules, openjdk
3.48 score 6 stars 10 scripts 398 downloads
recommenderlabJester - Jester Dataset for 'recommenderlab'
Provides the Jester Dataset for package recommenderlab.
Last updated 3 years ago
recommender-systems
2.70 score 1 script 214 downloads
recommenderlabBX - Book-Crossing Dataset (BX) for 'recommenderlab'
Provides the Book-Crossing Dataset for the package recommenderlab.
Last updated 3 years ago
recommender-systems
2.70 score 1 script 174 downloads