Data-driven Product Functional Configuration: Patent Data and Hypergraph

The product functional configuration (PFC) is typically used by firms to satisfy the individual requirements of customers and is realized based on market analysis. This study aims to help firms analyze functions and realize functional configurations using patent data. This study first proposes a patent-data-driven PFC method based on a hypergraph network. It then constructs a weighted network model to optimize the combination of product function quantity and object from the perspective of big data, as follows: (1) The functional knowledge contained in the patent is extracted. (2) The functional hypergraph is constructed based on the co-occurrence relationship between patents and applicants. (3) The function and patent weight are calculated from the patent applicant's perspective and patent value. (4) A weight calculation model of the PFC is developed. (5) The weighted frequent subgraph algorithm is used to obtain the optimal function combination list. This method is applied to an innovative design process of a bathroom shower. The results indicate that this method can help firms identify optimal function candidates and develop a multifunctional product.


Introduction
To remain relevant in a competitive market, manufacturers typically strengthen product innovation and development capability to satisfy customer demands. Customers are typically attracted to multifunctional products instead of products with a single function [1,2]. Therefore, firms tend to develop multifunction products (MFPs) that can fulfil the various demands of customers. Effectively designing new products and combining many functions remain important tasks for firms. In this context, the MFP design process is a challenging task as it requires engineering designers to solve the following three problems of how-how-what (2H-W):
• How can functional information be obtained?
• How is the function evaluated?
• Which functions are suitable to be integrated into one product?
A functional configuration is typically realized by experts, who combine suitable functions to satisfy customer demands. Although this method is frequently used, it is inherently subjective because it relies on the designer's experience and skills. In addition, some methods analyze function demands through questionnaires or surveys, which are time consuming and yield limited data.
In recent years, owing to the development of big data technology, data-driven methods have been applied in product design, including functional analysis and decision support. The quality and reliability of data-driven design results are affected by the data source. Data sources have been expanded from a single type to various data sources, such as website reviews, machine data, physiological data, and patent data. In fact, the functional configuration must not only resolve the choice of data source, but also be supplemented by reasonable and effective methods. Compared to other data sources, patent data are a vital knowledge source for function-oriented product design, as they provide access to a significant amount of data and are closely related to product development [3].

Open Access
Chinese Journal of Mechanical Engineering
To achieve greater market share and margins, enterprises must continually develop new products with multiple functions. Moreover, firms must apply for patents to protect their intellectual property rights and avoid plagiarism. Many factors, such as product type, enterprises' business model, and production lead time, form a complicated relationship, as reflected in patents; this is problematic in design practice. Owing to the advantages of network theory in multi-entity relationship analysis, network-based patent analysis has recently garnered increasing attention from scholars [4]. Luo et al. investigated the product design space using a patent network [5].
Based on these observations, a patent-driven method for product function deployment based on a hypergraph is proposed herein. In addition, this study is performed to combine data mining technology with networks to solve multifunctional configuration problems during product development. The innovations of this study are reflected as follows: ① Functional knowledge is extracted from patents as nodes and then combined with product patents and patent applicants as edges to construct a multiedge hypergraph network; ② the function node's weight is calculated using the applicant edge, and the patent edge's weight is derived by combining the number of patent citations and the number of patent families; ③ a multifunctional combination weight calculation model is proposed and combined with the network subgraph algorithm to obtain the optimal function combination.
The remainder of this paper is organized as follows: The theoretical background and related literature review are presented in Section 2. In Section 3, a research framework for a patent-driven product functional configuration (PFC) is proposed. A weighted function hypergraph (FH) is described in Section 4, followed by an empirical analysis that verifies the validity of the proposed method. Finally, the conclusions are presented.

Literature Review
As a portfolio innovation, the PFC refers to the convergence of different technical features and solutions that can be distinguished from existing products [6]. This convergence is realized by combining elements with the original product [7]. However, the PFC is not an arbitrary superposition of elements; as such, an accurate and comprehensive analysis is necessitated [8]. Therefore, current research pertaining to technology combination primarily focuses on three aspects: ① patent data analysis, ② technology opportunity analysis, ③ technology convergence analysis.

Patent Data Analysis
In patent mining, valuable data are extracted from both structured and unstructured patents. Structured data primarily include the application date, citation relationship, and classification number. Some methods attempt to extract technological information from data. In a previous study, Daim et al. [9] analyzed a technology development trend based on patent numbers and then identified core technologies for a firm's R&D. The results showed that the number of patent applications for emerging technologies increased faster than those for other technologies. Meanwhile, other scholars accumulated patent citation data and created a potential technology reorganization list [10]. A patent citation is a document cited by applicants or patent office examiners, and its content is associated with other patent applications. In general, the more citations a patent receives, the higher its quality and intrinsic technical value. Meanwhile, after a patent application has been filed, the number of citations associated with the patent will continue to increase, and its value will increase as well. In addition to citations, patent classification numbers are typically used for technical analyses [11]. The patent classification scheme is a coding system that classifies inventions in a technical field [12]. The typically used classification numbers are the International Patent Classification (IPC) and Cooperative Patent Classification (CPC). Compared to the IPC, which comprises five levels, the CPC comprises six or seven levels and can provide more detailed technological information. Therefore, more scholars tend to apply CPC numbers instead of IPC numbers to identify technological opportunities [13].
Unstructured patent information is primarily composed of textual data, which are an important data source for current studies and provide abundant and detailed information. For instance, Blake and Ayyagari [14] obtained market hotspot information via the trend analysis of text themes in patents. Zhang and Yu [15] investigated technical topic extensions using a keyword analysis algorithm. In addition to keywords, some scholars used the relationships between words to identify design opportunities. Choi et al. [16] integrated dependency syntax and part-of-speech filtering methods to obtain subject, action, and object vocabulary, and then obtained word phrases related to technological opportunities. Kwon et al. [17] identified the unintended consequences of emerging technologies by mining underlying semantics from patent texts.

Technology Opportunity Analysis (TOA)
TOA is a method of innovation monitoring based on bibliometric analysis and data mining. Arguably, TOA has become more important owing to the increase in the uncertainty and risk of product development. By monitoring the technological development of enterprises, Hou and Yang [18] identified valuable patents that were overlooked for a significant period as data sources for identifying design ideas. In addition, scholars have investigated the formation of patent jungle communities as a technological opportunity [19]. For instance, Jin et al. [20] used a technical efficiency matrix to identify vacuum technology, which has not yet been considered as a technology expansion objective. Li et al. [21] observed that goalkeeper patents are vital to the transfer of scientific theory to industrial applications.
Other researchers have attempted to identify technological opportunities in patents through text mining. Wang et al. [22] analyzed text topic development trends to identify topics associated with technological convergence. Yun and Geum [23] used the latent Dirichlet allocation algorithm to extract technical topics from patents. Kim et al. [24] monitored the development path through patent semantic similarity at different application times and provided a technology prediction reference. Li et al. [25] combined TRIZ theory and natural language processing technology to evaluate patent creativity and identified high-impact patents. Sheu and Yen [26] extracted information regarding harmful resources from patents to reduce risks associated with R&D.

Patent Data Visualization
Data visualization is crucial for understanding the results of patent analysis. Currently, patent data visualization methods are primarily classified into three categories: two-dimensional maps, incidence matrices, and network graphs.
A two-dimensional map is used to segment multidimensional information into two dimensions to facilitate visualization. Lee et al. [27] attempted to reduce the amount of patent data through principal component analysis to construct a technology map and then identify technology from blank areas. Lee et al. [28] constructed a landscape map from patent information as a vector space model to present the configuration of technological components. Seo et al. [29] proposed a portfolio map method using two patent values for novelty indices as axes to investigate the patents of competing enterprises and then identify technological opportunities.
The incidence matrix is a logical matrix that shows the relationship between two classes of objects, including the morphology, design structure, and vector space matrices. Arciszewski [30] generated new schemes through literature mining: technical keywords were first mined from patent text data and then combined through a morphological matrix to help designers conceive new ideas. Feng et al. [31] calculated the correlation coefficient between technology and product using a correlation matrix and then identified technological development opportunities that are suitable for the current product. In addition, the design structure matrix is a typically used tool for analyzing the relationships between different objects. Zheng et al. [32] constructed a pairwise relationship matrix between themes, in which the matrix element is the number of co-occurring patents. The vector space model (VSM) is one of the most robust information-analysis methods developed hitherto. Jun et al. [33] introduced a matrix mapping and K-medoids clustering method based on a vector matrix to predict missing technology more accurately. Lei et al. [34] proposed a patent analytics method based on a VSM to address semantic loss and the curse of dimensionality.
In recent years, an increasing number of scholars have adopted graphs to perform patent analysis. Compared to two-dimensional maps and the incidence matrix, network graphs provide a better visualization through nodes and edges, and they are applied to technology weights and clusters via degree measurement algorithms [35]. Kim et al. [36] identified core technologies from the perspective of technological cross impacts using network graphs and association rule algorithms. Sung et al. [37] used expanding cell structure networks to analyze core technologies. Song et al. [38] demonstrated patent keywords through a core-peripheral network, as well as important technical keywords through gravity algorithms. Some studies were conducted using subgraphs formed from a subset of vertices of a graph and all the edges connecting pairs of vertices in the subsets. Lee et al. [39] used a subgraph unit based on the existing node analysis and applied a quadratic assignment problem algorithm to calculate the correlation between different subgraphs to analyze the technological integration. Lee et al. [40] adopted a frequent subgraph algorithm to analyze the correlation among network nodes and obtained the best technology combination by calculating the confidence and support. Sun et al. [41] formed different patent clusters using text mining technology and then weighted the overlap between different clusters to analyze the technological integration.
Many scholars have performed TOA using patent-data-driven methods. The deficiencies of current studies are as follows:
• The objectives of previous studies focused primarily on technical opportunities and rarely involved the excavation of functional requirements. As such, good suggestions for functional market expansion are difficult to provide.
• Two necessary procedures are overlooked in the current research: calculating the weights of convergent objects in the functional configuration and organizing the clusters formed after convergence.
• The analysis tools used in current investigations typically assume that objects exhibit a single relationship. However, multiple relationships exist in terms of the patent co-occurrence between patent functions and the applicant. These relationships can affect the identification and integration of functional opportunities.
In this context, a novel and efficient method must be developed to help firms identify function opportunities in the market and create optimal functional configurations based on patent data by addressing the 2H-W.

Research Framework
Firms often apply for patents on the design schemes of multifunction products, particularly consumer products, to expand the patent protection scope and reduce patent fees [42]. Thus, the functional configuration of existing products can be analyzed using patent data. In this study, a new patent-data-driven PFC method based on a hypergraph is proposed, and a framework based on this method is developed, as shown in Figure 1. This framework comprises four steps: patent data acquisition and mining, function hypergraph construction, functional configuration scheduling, and configuration analysis.
Step 1: Patent data acquisition and mining. R&D terms are used to retrieve industry patents based on customer demands and the industry life cycle. Patents are downloaded from the website to construct a local database. In these patents, structured data, such as the number of citations, applicants, and application dates, are obtained via paragraph cutting. Unstructured data, such as text data, must be cleaned by removing noisy information such as numbers, symbols, and auxiliary vocabularies.
Step 2: Function hypergraph construction. First, multipart text mining based on the term frequency-inverse document frequency (TF-IDF) algorithm (MPTM-TFIDF) is used to weight the words. Subsequently, keyword phrases are extracted from the patent text as patent function labels based on regular expressions. Words that compose a phrase should appear in the same sentence simultaneously, such as the phrase "cold water," which is composed of both "cold" and "water" in the same sentence. Subsequently, a label set with different functions is formed. In addition, an adjacency matrix is applied to describe the relationship between the function and the patent or applicant. Finally, the patent function hypergraph model is constructed based on the matrix.
Step 3: Functional configuration scheduling. The applicant edge is used to weight the function node. The citation number and patent family size are integrated to weight the patent edges. A comprehensive calculation model is constructed for the weight of the function community, and an improved frequent subgraph algorithm (IFSA) is proposed to identify optimal function combinations in the hypergraph network.
Step 4: Functional configuration recommendation. Based on the existing product functions or customer requirements, the target functions are obtained from the results of Step 3. Finally, the accuracy of the configuration results is verified through market analysis.
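The same-sentence rule in Step 2 can be illustrated with a minimal Python sketch. The function name `label_functions`, the sentence-splitting pattern, and the sample abstract are hypothetical choices made for illustration; the source only states that the words of a phrase must co-occur in one sentence and that regular expressions are used.

```python
import re

def label_functions(text, function_phrases):
    """Return the labels whose component words all occur in one sentence of `text`."""
    # Assumed sentence splitter: break after ., !, ? or ; followed by whitespace.
    sentences = re.split(r"(?<=[.!?;])\s+", text.lower())
    labels = []
    for label, words in function_phrases.items():
        # Each word may carry a suffix (e.g. "spray" also matches "sprays").
        patterns = [re.compile(r"\b" + re.escape(w) + r"\w*") for w in words]
        if any(all(p.search(s) for p in patterns) for s in sentences):
            labels.append(label)
    return labels

# Hypothetical patent abstract and function phrases.
abstract = ("A showerhead mixes cold and hot water in a chamber. "
            "The handle allows handheld use.")
phrases = {
    "cold water": ["cold", "water"],
    "massage": ["massage"],
    "handheld": ["handheld"],
}
print(label_functions(abstract, phrases))
```

Splitting into sentences first keeps "cold" in one sentence and "water" in another from being counted as the phrase "cold water".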

Keyword Extraction Based on MPTM-TFIDF
TF-IDF is a statistical measure that evaluates the importance of a word to a document in a collection [43]. TF-IDF is expressed mathematically in Eq. (1). In the keyword extraction process, the TF-IDF algorithm is typically used to extract valuable words that appear rarely in documents but are essential [44]. However, the effect of the algorithm is determined by the text data volume and synonyms [45]. In practice, product function words are distributed in different sections of the patent, such as the title, abstract, claim, and technical background, and the amount of text data in different sections varies significantly. Moreover, many synonyms exist for the function keywords in the patents. All of the abovementioned factors affect the accuracy of the algorithm.
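As a minimal sketch of the weighting, the following Python function computes TF-IDF from the quantities named in the symbol definitions for Eq. (1): the term frequency f(T,D), the document length s(D), the number of documents s(N), and the document frequency c(T,N). It assumes the common form TF = f(T,D)/s(D) and IDF = log(s(N)/c(T,N)) with a natural logarithm; the exact base or smoothing used in the study is not stated, and the sample titles are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """documents: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(documents)                  # s(N): number of documents
    doc_freq = Counter()                     # c(T, N): documents containing each term
    for tokens in documents:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in documents:
        counts = Counter(tokens)             # f(T, D): frequency of each term
        size = len(tokens)                   # s(D): number of terms in the document
        weights.append({
            term: (freq / size) * math.log(n_docs / doc_freq[term])
            for term, freq in counts.items()
        })
    return weights

# Hypothetical tokenized patent titles.
titles = [
    ["rain", "shower", "head"],
    ["handheld", "shower", "head"],
    ["massage", "shower", "spray"],
]
w = tf_idf(titles)
print(w[0])
```

In this toy corpus, "shower" occurs in every title and therefore receives a weight of zero, whereas "rain" occurs in only one title and receives the highest weight in that title.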
Hence, the MPTM-TFIDF method is proposed herein. First, to ensure high keyword extraction accuracy, critical information is extracted from all patent titles, because the amount of title text is significantly less than that of other sections. Subsequently, keywords with higher TF-IDF weights in different patents are obtained, and synonyms with the same meanings and high similarity are merged. Notably, the similarity is calculated using WordNet, which is a large English lexical database comprising 155287 words and 117659 synonym sets (it can be downloaded from the website https://wordnet.princeton.edu). The semantic distance information of words is recorded in the database and can be extracted using the Natural Language Toolkit (NLTK) to calculate the similarity between words, which is then used to mine synonyms based on an empirical threshold. This method has been described in many papers [46][47][48] and thus will not be further explained herein. Finally, using the elements of the resulting keyword set, similar words in the abstract and the technical background of all patents are searched via regular expressions to determine each patent's functions.
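The threshold-based synonym merging can be sketched as follows. The similarity function is passed in as a parameter so that the example runs without WordNet data; in practice it would be replaced by a WordNet-based similarity. The name `merge_synonyms`, the single-link grouping strategy, and the toy similarity scores are assumptions made for illustration.

```python
def merge_synonyms(keywords, similarity, threshold):
    """Group keywords whose pairwise similarity exceeds `threshold` (single-link merging)."""
    parent = {k: k for k in keywords}

    def find(k):
        # Follow parent links to the group representative, shortening the path on the way.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for i, a in enumerate(keywords):
        for b in keywords[i + 1:]:
            if similarity(a, b) > threshold:
                parent[find(b)] = find(a)

    groups = {}
    for k in keywords:
        groups.setdefault(find(k), []).append(k)
    return list(groups.values())

# Toy similarity scores standing in for a WordNet-based measure (hypothetical values).
toy_scores = {
    frozenset(("spray", "jet")): 0.5,
    frozenset(("massage", "pulse")): 0.3,
}

def toy_similarity(a, b):
    return toy_scores.get(frozenset((a, b)), 0.0)

groups = merge_synonyms(["spray", "jet", "massage", "pulse", "rain"], toy_similarity, 0.1)
print(groups)
```

Each resulting group can then be represented by one function keyword. Single-link merging is only one option; a stricter rule would require every pair within a group to exceed the threshold.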

Hypergraph Model Construction
In an ordinary graph, one edge connects precisely two vertices and thus denotes a one-to-one relationship [49]. The structure is concise but limited in expressing the relationships between multiple vertices [50]. By contrast, a hyperedge in a hypergraph can link any number of nodes, and multiple hyperedges can exist simultaneously in the same network. Therefore, a hypergraph was selected for this study.
A hypergraph is defined as $H = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is a finite set of nodes known as vertices, and $E = \{e_1, e_2, \ldots, e_m\}$ is an indexed family of sets known as hyperedges, in which $e_i \subseteq V$. The degree of a vertex is the number of hyperedges to which it belongs, i.e., $d(v) = |\{e : v \in e\}|$, and the size of a hyperedge is its cardinality, i.e., $|e_i| = k$ $(1 \le k \le n)$. A hypergraph in which every hyperedge has size $k$ is known as a $k$-uniform hypergraph, whereas a 2-uniform hypergraph is an ordinary graph [51]. Figure 2 shows an example of three types of graphs.
The hypergraph can be represented as a $|V| \times |E|$ incidence matrix with elements $h(v, e)$, whose values are defined as shown in Eq. (4).
In addition, Figure 3 illustrates the relationship between the incidence matrix and hypergraph.
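The incidence-matrix representation can be sketched in a few lines of Python. The sketch assumes the standard indicator definition, in which h(v,e) is 1 when vertex v belongs to hyperedge e and 0 otherwise; the vertex and hyperedge names are hypothetical.

```python
def incidence_matrix(vertices, hyperedges):
    """Rows follow `vertices`, columns follow `hyperedges`; an entry is 1 if the vertex is in the edge."""
    return [[1 if v in edge else 0 for edge in hyperedges.values()] for v in vertices]

def vertex_degrees(vertices, hyperedges):
    """Degree d(v): the number of hyperedges that contain v."""
    return {v: sum(v in edge for edge in hyperedges.values()) for v in vertices}

# Hypothetical hypergraph with four vertices and three hyperedges of different sizes.
vertices = ["v1", "v2", "v3", "v4"]
hyperedges = {"e1": {"v1", "v2", "v3"}, "e2": {"v2", "v4"}, "e3": {"v3"}}

H = incidence_matrix(vertices, hyperedges)
degrees = vertex_degrees(vertices, hyperedges)
for v, row in zip(vertices, H):
    print(v, row)
print(degrees)
```

Each row sum of the matrix equals the degree of the corresponding vertex, and each column sum equals the size of the corresponding hyperedge.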

PFC Model Construction
The PFC involves not only products and firms, but also complex multi-entity and multilateral relationships. When the patent-driven functional configuration method is adopted, these two relationships are transformed into patent and applicant relationships to form a hypergraph model of product functions. Because the FH has more than one hyperedge, it can be illustrated using incidence matrices with hyperedges and nodes. The value of the matrix elements can be calculated using Eq. (4).

Hypergraph Weight Calculation
The calculation for the hypergraph weight includes those for the function node weight and patent hyperedge weight.

Function Node Weight Calculation Based on Patent Applicant Hyperedge
For a product to be a leader in the market, it must satisfy customer requirements continuously. In this regard, important functions must be integrated to increase market attractiveness. Currently, the definitions of essential functionality are scarce. Based on the definition of technological opportunities [52], an important function can be defined as follows:
Definition 3. Important functions refer to those that are widely and promptly accepted by the market.
From a market perspective, ensuring that a function is generally accepted by consumers is important. From the perspective of patents, important functions are widely used by applicants. The higher the involvement of enterprises in product development, the more critical the function becomes [53]. The later the feature appears, the greater is the probability of it becoming popular.
Therefore, the weight of the function node $w_{f_i}$ in the hypergraph is calculated based on the applicant hyperedges and the time index of the function, as shown in Eq. (5).
$wt_i$ denotes the time index of function $f_i$, and $ec_i^j$ denotes the hyperedges covering the node of function $f_i$. The longer the function is available, the less popular it will be in the market. By contrast, new features are more likely to become popular in the market. Therefore, the time index of the function is calculated as shown in Eq. (6).
$T_n$ denotes the current year, and $T_i^f$ denotes the year in which a product with function $f_i$ was first filed as a patent. Combining the formulas above yields the weight of function $f_i$. Owing to the significant deviation in the number of patent applicants' hyperedges for different functional nodes, data normalization is necessary to bring the values onto a common scale. Many types of statistical normalization methods exist, including standardized moment, coefficient of variation, standard score, and max-min normalization. Based on the patent data characteristics, max-min normalization is adopted such that all values are within the range [0,1], as shown in Eq. (8).
$\min(w_{f_i})$ and $\max(w_{f_i})$ denote the lowest and highest values of the weight range of all functional nodes, respectively.
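Max-min normalization maps each raw weight w to (w - min) / (max - min). A minimal Python sketch follows; the raw node weights are hypothetical values, and mapping every weight to 0 when all raw values are equal is an assumed convention for the degenerate case.

```python
def min_max_normalize(weights):
    """Rescale a {name: value} mapping linearly onto [0, 1]."""
    lo, hi = min(weights.values()), max(weights.values())
    if hi == lo:
        # Degenerate case: every value is identical (assumed convention: map to 0).
        return {name: 0.0 for name in weights}
    return {name: (w - lo) / (hi - lo) for name, w in weights.items()}

# Hypothetical raw function-node weights before normalization.
raw = {"rain": 12.0, "handheld": 30.0, "massage": 3.0}
norm = min_max_normalize(raw)
print(norm)
```

The function with the largest raw weight is mapped to 1 and the one with the smallest to 0, so node weights computed from very different applicant counts become comparable.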

Calculation of Patent Hyperedge Weight Based on Patent Quality
The functional distribution of the corresponding products can be extracted through product patent analysis. The patent hyperedge weight is calculated from the perspective of patent quality. According to the World Intellectual Property Organization [54], citation number and patent family size are two core indexes for calculating patent quality. The more frequently a patent is cited, the greater is its impact [55]. When the size of the patent family increases, the number of countries in which the patent is filed as well as the economic value of the patent increase [56]. Therefore, these two indexes are incorporated into the patent hyperedge weight calculation, as shown in Eq. (9).
$w_{ep_i}$ denotes the weight of the patent hyperedge $ep_i$; $fep_i$ denotes the number of patent families of patent $ep_i$; $cep_i^t$ denotes the number of citations per year of patent $ep_i$ and reflects the degree to which the patent is valued by peers; $\phi$ denotes the weight ratio of $fep_i$ and $cep_i^t$. It is noteworthy that a company should apply for a patent on a product as soon as it is developed. The earlier a patent is applied for, the greater is the possibility for it to be cited. Consequently, its number of citations will become higher than that of subsequent patents [57]. To eliminate the effect of patent application time, the number of patent citations per year over the entire life cycle is set as the weight calculation index. Hence, $cep_i^t$ is calculated from $cep_i$, the total number of citations of patent $ep_i$, and $T_i^{ep}$, the application year of patent $ep_i$. Through max-min normalization, the weight $w_{ep_i}$ of $ep_i$ is then obtained.
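One plausible reading of this calculation is sketched below in Python. The combination rule is an assumption made for illustration, not the study's equation: citations per year are taken as total citations divided by the number of years since application, both indexes are max-min normalized, and the two are blended as ϕ·family + (1 − ϕ)·citations. The patent records are hypothetical.

```python
def citations_per_year(total_citations, application_year, current_year):
    # Assumed form: total citations divided by patent age, with age of at least 1 year.
    years = max(current_year - application_year, 1)
    return total_citations / years

def patent_edge_weights(patents, phi=0.5, current_year=2021):
    """patents: {id: (family_size, total_citations, application_year)}."""
    fam = {p: float(f) for p, (f, _, _) in patents.items()}
    cit = {p: citations_per_year(c, y, current_year) for p, (_, c, y) in patents.items()}

    def scale(values):
        # Max-min normalization onto [0, 1]; constant inputs map to 0.
        lo, hi = min(values.values()), max(values.values())
        return {k: 0.0 if hi == lo else (v - lo) / (hi - lo) for k, v in values.items()}

    fam_n, cit_n = scale(fam), scale(cit)
    # Assumed blend: phi weights family size, (1 - phi) weights citations per year.
    return {p: phi * fam_n[p] + (1 - phi) * cit_n[p] for p in patents}

# Hypothetical patents: (family size, total citations, application year).
patents = {"ep1": (1, 0, 2015), "ep2": (11, 30, 2011), "ep3": (6, 12, 2017)}
weights = patent_edge_weights(patents)
print(weights)
```

Here ep2 and ep3 receive the same citation score (3 citations per year) although ep2 has more total citations, which shows how dividing by patent age removes the advantage of earlier filing.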

Functional Configuration Based on Hypergraph
The PFC comprises two aspects: evaluation and acquisition of the functional community.

Evaluation of Function Community
To ensure the versatility of a product, a functional community is formed and reflected in an FH. Before the PFC is formed, the functional community must be evaluated comprehensively in advance and filtered through a hypergraph. In the PFC model, nodes representing functions are connected through patent hyperedges. The strength of the connection depends on the weight of the hyperedge. The higher the weight, the closer is the connection between nodes, such that some nodes form clusters or communities. The hyperedge weight is an indicator for evaluating the community. The importance of nodes is another indicator for evaluating communities. Node weight is positively correlated with the importance of the community. For example, the handheld function and rain function are closely related, and the weights of the two functions are high; therefore, the two functions can be easily integrated into the same product.
Suppose that function nodes and hyperedges form the same community subgraph $FH_o(F_o, ep_o)$. The weight of the functional community $w_{FH_o}$ is calculated using Eq. (12). Based on Eq. (12), the weight of a community is higher when it contains more functions and patents. This is because a product that contains many functions can satisfy various individual requirements, which is welcomed by customers. Meanwhile, the more products with similar functional combinations, the more critical the functional community becomes.

Function Community Acquisition Based on Frequent Subgraphs
An FH contains many subgraphs, each representing a function community. To obtain the optimal combination of functions, a frequent subgraph algorithm is introduced to select the optimal function community. Currently, two frequent subgraph mining algorithms are typically used: Apriori and FP-Growth. Compared to the FP-Growth algorithm, the Apriori algorithm is more mature and widely used [58]. Therefore, Apriori was adopted in this study for FH subgraph mining.
The Apriori algorithm is generally used to screen subgraphs. However, existing studies that obtain the optimal subgraph based on the Apriori algorithm disregard the weight of the subgraph, and the results are inaccurate. Therefore, an IFSA is proposed herein. This algorithm uses the weights of functional communities as the basis for subgraph screening, where $w_H$ is the sum of the weights covering all patent communities. Through the IFSA, the optimal number of function combinations $k$ is obtained, and groups with better weights from the same number of function combinations are explored. The algorithm provides a reference for designers to determine the function quantity and optimal function combinations.
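The idea of weight-based screening can be illustrated with a generic weighted variant of Apriori. This is a sketch under stated assumptions, not the IFSA itself: the weighted support of a function set is taken as the summed weight of the patent hyperedges that contain all of its functions, divided by the total weight of all patent hyperedges, and a set is kept when this support reaches a threshold τ. The patents, weights, and threshold are hypothetical.

```python
from itertools import combinations

def weighted_frequent_sets(patent_functions, patent_weights, tau):
    """Level-wise (Apriori-style) search for function sets with weighted support >= tau."""
    total = sum(patent_weights.values())

    def support(candidate):
        # Weight of the patents covering every function in the candidate, over the total weight.
        covered = sum(w for p, w in patent_weights.items() if candidate <= patent_functions[p])
        return covered / total

    items = sorted({f for funcs in patent_functions.values() for f in funcs})
    level = {frozenset([f]) for f in items}
    frequent = {}
    while level:
        kept = {c: s for c in level if (s := support(c)) >= tau}
        frequent.update(kept)
        size = len(next(iter(level))) + 1
        # Join sets that differ by one function; prune candidates with an infrequent subset.
        level = {
            a | b for a, b in combinations(kept, 2)
            if len(a | b) == size
            and all(frozenset(s) in kept for s in combinations(a | b, size - 1))
        }
    return frequent

# Hypothetical function labels and hyperedge weights for four patents.
patent_functions = {
    "ep1": {"rain", "handheld"},
    "ep2": {"rain", "handheld", "massage"},
    "ep3": {"rain"},
    "ep4": {"massage"},
}
patent_weights = {"ep1": 0.5, "ep2": 1.0, "ep3": 0.25, "ep4": 0.25}

frequent = weighted_frequent_sets(patent_functions, patent_weights, tau=0.6)
for funcs, s in sorted(frequent.items(), key=lambda kv: (-len(kv[0]), -kv[1])):
    print(sorted(funcs), s)
```

Because adding a function to a set can only shrink the group of patents that cover it, the weighted support never increases with set size, so the usual Apriori pruning remains valid. Grouping the surviving sets by size gives candidate combinations for each function count k.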

Case Study
Fulfilling the demands of every individual customer for bathroom furniture and accessories is a challenging task, particularly for showerheads. Many enterprises aim to develop fashionable and attractive products. This section presents a case study of the proposed method. The algorithms were encoded and executed using Python. Patent data were retrieved and downloaded from PatSnap (https://www.patsnap.com/), which is a well-known commercial patent database. Currently, showerheads with rain functions are primarily manufactured by one firm. This product lacks competitiveness as its functions are scarce. Therefore, the firm intends to develop a new multifunction shower and has commissioned us to aid in patent analysis to identify new function opportunities in the market and then arrange the functional configuration. Initially, we used the keywords and CPC numbers to search for patents filed with the USPTO. The search formula used was as follows: Title or Abstract:(showerhead* or shower head* or sprayer*) AND CPC:(B05B1/18) AND Time:(from 19140101 to 20200101). A total of 1358 patents were obtained from the USPTO database (as listed in Table 1).
The TF-IDF algorithm is used to calculate the weight of the vocabulary in patent titles, and the results are shown in Table 1.Words with high values are often associated with product functions.Word similarity is calculated using the WordNet database, and synonyms with a threshold exceeding 0.1 are merged into function keywords, as shown in Table 2.
The functions in Table 2 were used to label patents with regular expressions, and the results are listed in Table 3. Table 4 indicates that patents can have multiple functions. To further verify the effectiveness of this method, we compared our results with three typical keyword extraction algorithms, i.e., TF, MPTM-TF, and TF-IDF, based on the precision (P), recall (R), and F-value (F), as shown in Eqs. (13)-(15). TP, FP, and FN denote the numbers of true positive, false positive, and false negative instances, respectively. For this comparison, 10 patents containing more than 600 words were randomly selected as test objects, and experts were recruited to verify the effect; the results are shown in Figure 4. Compared to other algorithms, MPTM-TFIDF yielded significantly better P, R, and F values.
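The three evaluation measures can be computed as in the following Python sketch, which uses the standard definitions P = TP/(TP+FP) and R = TP/(TP+FN) and assumes that the F-value is the balanced F1 score, F = 2PR/(P+R). The extracted and expert keyword sets are hypothetical.

```python
def precision_recall_f(predicted, expected):
    """Return (P, R, F) for a set of extracted keywords against an expert-labelled set."""
    predicted, expected = set(predicted), set(expected)
    tp = len(predicted & expected)   # keywords extracted and confirmed by experts
    fp = len(predicted - expected)   # keywords extracted but not confirmed
    fn = len(expected - predicted)   # expert keywords that were missed
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Hypothetical extraction result and expert annotation for one patent.
extracted = {"rain", "handheld", "massage", "filter"}
expert = {"rain", "handheld", "massage", "pulse", "mist"}
p, r, f = precision_recall_f(extracted, expert)
print(p, r, f)
```

With three matches, one spurious keyword, and two missed keywords, the precision is 0.75 and the recall is 0.6.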
Applicant hyperedges were used to calculate the weight of the node. First, the current year was set at 2021. The number of patent applicants for all functions was calculated, and the min-max normalization algorithm was applied to obtain the weight in the range [0,1]. The calculation results are listed in Table 4. For a more concise visualization of the graph, the hypergraph is shown using the Python hypergraph tool (see Figure 5), where the patent hyperedge is labelled as "ep," and the applicant is labelled as "ec." The functions in Table 3 are used as the nodes, and both the patents and applicants in Table 1 are used as the hyperedges. To distinguish between different functions, functions with higher weights are represented by nodes with a larger radius.
Because the importance of the patent citation number cep and patent family size fep is equal, the weight ratio of the two indicators ϕ is set to 0.5 after a discussion among the experts. Based on the patent data, the weight w ep of the patent hyperedges is calculated using cep and fep (as listed in Table 5). The value range of the patent family size is (1, 115) and that of the patent citation number is (0, 245), as shown in Table 5. The max and min values are used in the max-min normalization.
Although the total number of patents is 1354, the number of communities is 753 when patents with the same function are merged into one community.The weights of the functional communities are identified using Eq. ( 12) and are listed in Table 6.Clearly, many MFPs are more popular than the products with fewer functions or a single function in the market.This indicates that the MFP is well-received by the market.
To verify the results obtained using our method, the number of products on an e-commerce website was counted. Currently, more than 7000 shower products are listed on Amazon (https://www.amazon.com). Because 753 function communities exist, based on an analysis of shower patents, each community has fewer than 130 functions. However, the functional communities are primarily identified in more than 200 products (Table 7). To further verify the effectiveness of the method, six functional communities with lower weights are listed. Table 7 shows that the quantities of these products are significantly lower than the average. This implies that MFP designs are more popular in the market.

Conclusions
A patent-data-driven method based on a hypergraph network was proposed herein to solve the 2H-W problem in the MFP design process. In addition, NLP and association-rule algorithms were applied. The contributions of this study are summarized as follows: (1) In this study, the MPTM-TFIDF algorithm was used to extract functional keywords from patent title text; subsequently, these keywords were used to retrieve patent full-text data to label each patent with function keywords. This method can accurately mine most functional data.
(2) An FH was constructed, in which functions represent the nodes and patents or applicants represent the hyperedges. The weight of a node was calculated from the applicant hyperedges, and the weight of the patent edge was calculated based on the number of citations and families. In addition, a community weight calculation model for the function nodes was proposed.
(3) Based on the improved Apriori algorithm, an IFSA algorithm suitable for a weighted hypergraph network was proposed.By calculating and comparing the weights of functional communities to determine the optimal functional combinations, this algorithm can promptly provide market opportunities for product design.
Finally, the method proposed herein was applied to the design of shower products and then verified using e-commerce data.In fact, a PFC must consider many factors, such as fashion, regulations, policies, and incentives.Therefore, patent data alone are insufficient for product design, and other types of data are required.

Figure 1 Patent functional configuration framework based on hypergraph

$w_{T,D}$ denotes the weight of term $T$ in document $D$; $TF(T,D)$ denotes the percentage of term $T$ in document $D$; $IDF(T)$ measures the rareness of term $T$ across all documents; $f(T,D)$ denotes the frequency of term $T$ in document $D$; $s(D)$ denotes the number of terms in document $D$; $s(N)$ denotes the number of all documents; $c(T,N)$ denotes the number of documents that contain the term $T$.

Figure 2 Three types of graphs
Figure 3

$w_{ep_i^o}$ denotes the weight of the patent hyperedge $ep_i^o$; $w_{f_i^o}$ denotes the weight of function node $f_i^o$; $N_{F_o}$ denotes the number of function communities $F_o$.

Definition 4. IFSA. For a hypergraph $H$ and a minimum comprehensive weight $\tau$, $S_{up}(H, FH_o)$ represents the weight of the subgraph $FH_o$ in $H$; when $S_{up}(H, FH_o) \ge \tau$, $FH_o$ is a frequent subgraph of $H$. $S_{up}(H, FH_o)$ is calculated as follows:

Table 1
Information of patents

Table 2
List of functions
*Multiple arbitrary characters

Table 3
Function labels of patents

Table 4
Weight of function nodes in hypergraph

Figure 5 Hypergraph with edges and nodes

Table 5
Weights of patent edges in hypergraph

Table 6
Weight of function community
Note: To save space, only part of the evaluation data is provided