3.1 Data
We use register-based employer-employee data provided by Statistics Sweden, which allow us to match individuals to the establishments where they are employed. The data contain yearly information on the labor market status of individuals, with employment status recorded in November of each year. Based on this employment status, we can differentiate whether an individual is an employee or self-employed. Self-employment is defined as business ownership, and an individual is reported as self-employed if at least half of her income originates from a business she owns.
Similar to Hensvik and Skans (2016), we exclude establishments with more than 500 employees throughout the period considered in this paper. This threshold reduces the computational burden and limits measurement error, since the likelihood of mistaking a common working history for a true coworker tie is greater in larger organizations. Following, e.g., Hensvik and Skans (2016), we exclude new hires between any two establishments that exceed 5 employees to avoid capturing possible mergers and acquisitions. We exclude the agriculture, forestry, and mining industries from our data throughout the period and include only private sector employment. This is done because wages are not comparable across the public and private sectors (Wahlberg, 2010).
The included new hires are those made between 2010 and 2014. We pool all new hires who exit self-employment or move between two establishments as employees within this window of 4 consecutive years to increase the sample size and the generalizability of the results. The data we use in our empirical estimations are therefore cross-sectional. However, we have information on the employment histories of individuals dating back to 1993, which we use when measuring our coworker networks.
3.2 The network
We are interested in two types of new hires who are both fully employed in year t: those who were self-employed at t-1 and those who were wage employed at t-1. All new hires must also change their unique establishment identifier between years t-1 and t. This restriction, together with excluding labor mobility between two establishments of 5 or more individuals, enables us to exclude all self-employment acquisitions and mergers. As described above, we consider direct transitions for both groups of new hires, meaning that we observe the individual's employment status and establishment identifier at both t-1 and t. We include only new hires with no prior ties to the hiring organization to ensure that we do not capture pure firm selection.
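The selection rules for new hires can be sketched in code. The following is a minimal illustration rather than the procedure actually applied to the register data: the function name `identify_new_hires`, the column names, the status labels, and the toy panel are all hypothetical, "no prior ties" is interpreted here as never having been observed at the hiring establishment before, and the flow cutoff is left as the parameter `max_flow`.

```python
import pandas as pd

def identify_new_hires(panel, max_flow=5):
    """panel: one row per person-year with columns person, year, estab, status.

    Returns direct transitions into wage employment at a new establishment.
    All names are hypothetical; this is a sketch, not the authors' code.
    """
    cur = panel[["person", "year", "estab", "status"]]
    prev = cur.rename(columns={"estab": "estab_prev", "status": "status_prev"})
    prev = prev.assign(year=prev["year"] + 1)       # align t-1 rows with year t
    moves = cur.merge(prev, on=["person", "year"])  # keeps direct transitions only
    moves = moves[(moves["status"] == "employee")
                  & (moves["estab"] != moves["estab_prev"])]

    # Drop large flows between an establishment pair (possible mergers/acquisitions).
    flow = moves.groupby(["estab_prev", "estab", "year"])["person"].transform("count")
    moves = moves[flow <= max_flow]

    # Drop hires previously observed at the hiring establishment (assumed reading
    # of "no prior ties to the hiring organization").
    past = cur[["person", "estab", "year"]].rename(columns={"year": "past_year"})
    seen = moves.merge(past, on=["person", "estab"])
    seen = seen.loc[seen["past_year"] < seen["year"], ["person", "year"]].drop_duplicates()
    moves = moves.merge(seen, on=["person", "year"], how="left", indicator=True)
    moves = moves[moves["_merge"] == "left_only"].drop(columns="_merge")
    return moves.reset_index(drop=True)

# Toy panel: P1 exits self-employment, P2 changes jobs, P3 stays, P4 is rehired,
# and M0..M5 move together between the same two establishments.
rows = [
    ("P1", 2011, 10, "self-employed"), ("P1", 2012, 20, "employee"),
    ("P2", 2011, 30, "employee"), ("P2", 2012, 20, "employee"),
    ("P3", 2011, 20, "employee"), ("P3", 2012, 20, "employee"),
    ("P4", 2009, 40, "employee"), ("P4", 2011, 50, "employee"),
    ("P4", 2012, 40, "employee"),
]
rows += [(f"M{k}", y, e, "employee") for k in range(6)
         for y, e in ((2011, 70), (2012, 80))]
panel = pd.DataFrame(rows, columns=["person", "year", "estab", "status"])
hires = identify_new_hires(panel)
```

In this toy panel only P1 and P2 remain, and the column `status_prev` separates exits from self-employment from job changes.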
The incumbent workers, i.e., those who refer the new hire, are defined as workers observed at the establishment for at least two consecutive years. Therefore, they are observed at the establishment at least one full year prior to the hiring of the new employee. We construct the network from the shared employment histories of the new hire and the incumbent worker, i.e., their common previous labor market experience (in terms of establishment and year(s)) before the new hire joins the new workplace. The establishment they are hired into must be different from the establishment where they originally formed the link.
Our data span from 1993 onward, and we allow the coworker ties to be formed at any point between 1993 and 2 years before the individuals are hired. We construct yearly matched pairs of the new hire (i) and incumbent worker (j). For each new hire-incumbent pair, we define a variable indicating whether (i) and (j) worked at the same establishment at the same time and are now employed by the same establishment. This yields a dichotomous variable indicating whether the new hire has an existing coworker link in the new workplace.
The new hire and the incumbent worker must have worked together at the same establishment for at least 1 year before employment in year t. However, it should be noted that we make a significant assumption by concluding that the two employees know each other simply from their prior employment history. We provide various robustness tests to rule out the coworker ties that are most likely to be misidentified as referrals due to factors such as the large employer effect. The results are also robust when we define the networks within skill levels, i.e., only employees in higher-skilled occupations form links to each other.
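The construction of the link indicator described above can be sketched as follows. This is an illustrative reading of the rules, not the authors' implementation: the function `coworker_link`, the column names, and the toy data are hypothetical. Incumbents are taken to be workers observed at the hiring establishment in both t-1 and t, and a tie requires at least one shared establishment-year no later than t-2 at an establishment other than the hiring one.

```python
import pandas as pd

def coworker_link(spells, hires):
    """spells: person-year rows (person, estab, year); hires: (person, estab, year).

    Returns hires with a 0/1 column `link`. Hypothetical sketch only.
    """
    s = spells[["person", "estab", "year"]].drop_duplicates()

    # Incumbents j: present at the establishment in year t and in t-1.
    now = s.rename(columns={"person": "j"})
    prev = now.assign(year=now["year"] + 1)   # presence in t-1, relabelled as t
    incumbents = now.merge(prev, on=["j", "estab", "year"])

    # Pair each new hire i with every incumbent j at the hiring establishment.
    h = hires.rename(columns={"person": "i", "estab": "hire_estab", "year": "t"})
    inc = incumbents.rename(columns={"estab": "hire_estab", "year": "t"})
    pairs = h.merge(inc, on=["hire_estab", "t"])
    pairs = pairs[pairs["i"] != pairs["j"]]

    # Shared establishment-years in the histories of i and j.
    hist_i = s.rename(columns={"person": "i", "estab": "past_estab", "year": "past_year"})
    hist_j = s.rename(columns={"person": "j", "estab": "past_estab", "year": "past_year"})
    shared = pairs.merge(hist_i, on="i").merge(hist_j, on=["j", "past_estab", "past_year"])
    shared = shared[(shared["past_year"] <= shared["t"] - 2)
                    & (shared["past_estab"] != shared["hire_estab"])]

    linked = shared[["i", "t"]].drop_duplicates().assign(link=1)
    out = hires.merge(linked, left_on=["person", "year"], right_on=["i", "t"], how="left")
    out["link"] = out["link"].fillna(0).astype(int)
    return out[["person", "estab", "year", "link"]]

# Toy example: A and B were coworkers at establishment 1 in 2005; B is an
# incumbent at establishment 3 when A is hired there in 2012. D has no such tie.
spells = pd.DataFrame(
    [("A", 1, 2005), ("A", 1, 2006), ("A", 2, 2011), ("A", 3, 2012),
     ("B", 1, 2005), ("B", 3, 2011), ("B", 3, 2012),
     ("C", 4, 2005), ("C", 3, 2011), ("C", 3, 2012),
     ("D", 5, 2011), ("D", 6, 2012),
     ("E", 7, 2005), ("E", 6, 2011), ("E", 6, 2012)],
    columns=["person", "estab", "year"])
hires = pd.DataFrame([("A", 3, 2012), ("D", 6, 2012)],
                     columns=["person", "estab", "year"])
result = coworker_link(spells, hires)
```

Counting the distinct incumbents in `shared` per new hire, instead of reducing to a 0/1 flag, would give the number of links per hire.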
3.3 Empirical model and descriptive statistics
Our primary purpose is to examine whether new hires who exit self-employment and have existing coworker ties receive higher entry wages and whether the existence of a coworker tie has similar implications for the self-employed as for those who change employers. The wage equation we estimate for all the new hires is defined as:
$${w}_{i}=\alpha +{\tau }_{1}{E}_{i}+{\tau }_{2}{Link}_{i}+{\tau }_{3}\left({Link}_{i}\times {E}_{i}\right)+\boldsymbol{X}\gamma +{e}_{f}+{e}_{o}+{e}_{d}+{e}_{m}+{e}_{t}+{\varepsilon }_{i} \tag{1}$$
where \({w}_{i}\) is the natural logarithm of the entry wage of the new hire (i). The entry wages are measured at a yearly level and presented in Swedish krona using 2016 values. We are specifically interested in the estimated \(\tau\) coefficients. The variable \({E}_{i}\) indicates whether the individual is formerly self-employed, i.e., the new hire was self-employed at time t-1. The \({\tau }_{1}\) term estimates how much, in general, the individuals who exit self-employment earn relative to job changers. The variable \({Link}_{i}\) takes a value of 1 if the new hire has existing coworker ties in the new workplace and 0 if she does not. Therefore, the \({\tau }_{2}\) term is the estimated increase in entry wages associated with having coworker ties. We include an interaction term indicating whether the individual comes from self-employment and has coworker ties; its coefficient is denoted \({\tau }_{3}\). This term, therefore, answers the question of whether the wage return to ties differs between exiting self-employed workers and wage employees. The estimated difference between a formerly self-employed worker with a coworker link and a job switcher without a link can be calculated by summing all estimated \(\tau\) terms.
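To make the role of each coefficient explicit, the expected log entry wages implied by Eq. 1 can be written out for the four groups of new hires. The shorthand \(c\), which collects the constant, the controls, and the fixed effects held at common values, is introduced here for illustration only:

$$\begin{aligned}
\mathbb{E}\left[{w}_{i}\mid {E}_{i}=0,{Link}_{i}=0\right]&=c\\
\mathbb{E}\left[{w}_{i}\mid {E}_{i}=0,{Link}_{i}=1\right]&=c+{\tau }_{2}\\
\mathbb{E}\left[{w}_{i}\mid {E}_{i}=1,{Link}_{i}=0\right]&=c+{\tau }_{1}\\
\mathbb{E}\left[{w}_{i}\mid {E}_{i}=1,{Link}_{i}=1\right]&=c+{\tau }_{1}+{\tau }_{2}+{\tau }_{3}
\end{aligned}$$

The link premium is thus \({\tau }_{2}\) for job changers and \({\tau }_{2}+{\tau }_{3}\) for the formerly self-employed, so \({\tau }_{3}\) is the difference in the premium between the two groups. Subtracting the first line from the last gives \({\tau }_{1}+{\tau }_{2}+{\tau }_{3}\).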
In the vector of control variables (\(\boldsymbol{X}\)), we include individual and establishment characteristics that impact wage-setting following the Mincerian wage equation (Mincer 1958, 1974). Specifically, we include the labor market experience of individuals measured separately for the years of employee and self-employment experience (Experience and Self-employment Experience). As we measure coworker ties based on previous work history, these experience measures are additionally important because they control for the opportunity to have formed ties while accounting for overall employment experience. These two experience variables are measured in accumulated years of respective experience starting from 1993. We also include the years of schooling, which is based on the highest degree obtained (Schooling), the age of the individual (Age) and its squared term (Age²), the gender of the individual (Gender), whether the individual is married (Married), whether the individual has children living at home (Children), and whether the individual was born outside Sweden (Foreign-born). In addition, to control for workplace characteristics in the entry wage determination, we include the size of the establishment based on the total number of employees in logarithmic form (Establishment size), the establishment age measured in years since start-up (Establishment age), and whether the establishment belongs to a multi-establishment firm (Multi-establishment). Table 6 in the Appendix provides a correlation table of the independent variables included in this analysis.
Importantly, in Eq. 1, we control for establishment f, occupation o, industry d, labor market m, and year t fixed effects. The \({\varepsilon }_{i}\) term is the error term, which is clustered at the establishment level. The occupational data follow the Swedish Standard for Classification of Occupations (SSYK), which corresponds to international standards (ISCO-88). We control for occupations at the 2-digit level. The industry classifications follow the Swedish Standard Industrial Classification codes, which are based on the EU's recommended standards (NACE codes). They are reported at the establishment level, and we control for industry-specific wage determination at the 2-digit level. The labor markets are based on individual residences and comprise 60 local labor markets across Sweden. They are constructed based on commuting patterns and existing municipality borders.
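As a rough illustration of how a specification like Eq. 1 can be fitted, the sketch below simulates data, removes establishment fixed effects by demeaning within establishments, estimates the three \(\tau\) coefficients by least squares, and computes establishment-clustered standard errors. Everything in it is assumed for illustration: the simulated data, the coefficient values, the column names, and the simplification to a single fixed effect without controls are not taken from the paper, and no small-sample correction is applied.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n, n_estab = 20_000, 200

# Simulated new hires (hypothetical data, not the register sample).
df = pd.DataFrame({
    "estab": rng.integers(0, n_estab, n),
    "E": rng.integers(0, 2, n),        # 1 = formerly self-employed
    "Link": rng.integers(0, 2, n),     # 1 = has a coworker link
})
df["ExLink"] = df["E"] * df["Link"]
tau = np.array([-0.10, 0.05, 0.03])    # hypothetical tau_1, tau_2, tau_3
estab_effect = rng.normal(0.0, 0.3, n_estab)
regressors = ["E", "Link", "ExLink"]
df["w"] = (12.0 + df[regressors].to_numpy() @ tau
           + estab_effect[df["estab"].to_numpy()]
           + rng.normal(0.0, 0.1, n))

# Within transformation: subtract establishment means to absorb e_f.
cols = ["w"] + regressors
within = df[cols] - df.groupby("estab")[cols].transform("mean")
X = within[regressors].to_numpy()
y = within["w"].to_numpy()
tau_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Cluster-robust (sandwich) variance with establishments as clusters.
resid = y - X @ tau_hat
scores = (pd.DataFrame(X * resid[:, None])
          .groupby(df["estab"].to_numpy()).sum().to_numpy())
bread = np.linalg.inv(X.T @ X)
cluster_se = np.sqrt(np.diag(bread @ scores.T @ scores @ bread))
```

With several high-dimensional fixed effects, as in Eq. 1, a dedicated fixed-effects estimator would normally replace the single within transformation used here.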
We estimate Eq. 1 with ordinary least squares (OLS). Our main identification assumption lies in the ability to control for as many observable characteristics of individuals and firms as possible within the detailed register-level data while also controlling for the large set of fixed effects. However, it is well-known that the self-employed and employees are not directly comparable, which has led previous research to apply matching methods to make the two groups of individuals comparable (Kaiser & Malchow-Møller, 2011; Mahieu et al., 2019; Manso, 2016). We use the coarsened exact matching (CEM) estimator (Iacus et al., 2012), which allows the balance between the two groups to be chosen ex-ante. We match the exiting self-employed workers and job-switchers the year before they are new hires, i.e., at time t-1 when they are preparing to leave their prior employment. We match the two groups of individuals based on whether they are foreign-born, their age, gender, labor and business income, and the firm's productivity defined as value-added. Controlling for the baseline differences, especially the income and performance of the firm, can be important, as these are likely to drive the difference in wage negotiations. Detailed information about the covariate threshold values used in employee matching and the overall matching summary is provided in Appendix Table 7.
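The logic of coarsened exact matching can be sketched as follows: continuous covariates are coarsened into bins, observations are matched exactly on the resulting strata, strata lacking either group are dropped, and the remaining comparison observations are reweighted. The function `cem_weights`, the column names, the cutpoints, and the toy data below are hypothetical and chosen only for illustration; the thresholds actually used are those reported in Appendix Table 7.

```python
import numpy as np
import pandas as pd

def cem_weights(df, treat, cutpoints, exact):
    """Coarsened exact matching sketch (hypothetical helper).

    treat: name of a 0/1 column (1 = exits self-employment);
    cutpoints: {column: bin edges} for covariates to coarsen;
    exact: columns matched on their raw values.
    """
    d = df.copy()
    keys = list(exact)
    for col, edges in cutpoints.items():
        d[col + "_bin"] = pd.cut(d[col], bins=edges, labels=False, include_lowest=True)
        keys.append(col + "_bin")
    d = d.dropna(subset=keys)                      # values outside the bins are unmatched

    # Count treated and comparison units per stratum; keep strata with both.
    counts = d.groupby(keys)[treat].agg(n_t="sum", n="count")
    counts["n_c"] = counts["n"] - counts["n_t"]
    counts = counts[(counts["n_t"] > 0) & (counts["n_c"] > 0)]
    d = d.join(counts, on=keys, how="inner")

    # Treated units get weight 1; comparison units are reweighted so that each
    # stratum has the same relative size in both groups.
    N_t = (d[treat] == 1).sum()
    N_c = (d[treat] == 0).sum()
    d["weight"] = np.where(d[treat] == 1, 1.0, (d["n_t"] / d["n_c"]) * (N_c / N_t))
    return d

sample = pd.DataFrame({
    "treat":  [1, 0, 0, 1, 0, 1, 0],
    "age":    [30, 32, 33, 50, 52, 60, 28],
    "income": [300, 310, 320, 500, 510, 900, 100],
    "gender": [1, 1, 1, 0, 0, 1, 0],
})
matched = cem_weights(
    sample, "treat",
    cutpoints={"age": [20, 35, 45, 55, 65], "income": [0, 200, 400, 600, 1000]},
    exact=["gender"])
```

In the toy sample, the last two rows fall in strata with only one group and are dropped, and the weights of the remaining comparison units sum to their number.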
The matching aims to control for any confounding differences between the self-employed and the employees. However, the ties are identified only when individuals gain full-time employment, which the matching does not account for. This definition of the ties means that we are unable to measure any choice set individuals have based on potential new employers and their potential referrals in each firm, i.e., we are capturing only ties that are conditional on having gained employment. One possible remedy for overcoming such an issue would be to use surveys, i.e., obtain more qualitative data on the search process. Using population-wide register data, such as we are using, has the advantage of resulting in large and representative samples where one can track individuals across time. The disadvantage comes from, for example, the inability to trace counterfactuals for individuals' possible choices, as one has information only on the registered outcomes at a yearly level. Therefore, we are able to measure coworker links only for individuals who gained employment, and thus the results of having the links should be considered only for similar individuals who also gained full-time employment.
Even if we had access to richer data on coworker ties, the ties are not randomly allocated across individuals, which means that the estimate of \({\tau }_{2}\) is subject to endogeneity. In the absence of exogenous variation in the coworker ties, our estimations should not be interpreted strictly as causal. Our aim and contribution originate from being the first to map and find a relationship between the usage of coworker ties and the entry wages for the exiting self-employed while controlling for a large set of individual- and firm-level characteristics.
Table 1 describes the data used for our estimation sample. The mean values for those new hires from self-employment (From self-employment) and those changing jobs (Job changers) are presented separately. We also show the mean values separately for those with and without coworker links. Table 8 in the Appendix provides a complete set of descriptive statistics.
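Table 1 Descriptive statistics for new hires
Variable (mean) | From self-employment, with link | From self-employment, without link | Job changers, with link | Job changers, without link |
Individual-level data |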
Yearly wages (in SEK) | 427,366 | 366,228 | 425,631 | 368,810 |
Experience (in years) | 10.41 | 7.649 | 13.75 | 11.12 |
Self-employment experience (in years) | 4.504 | 5.463 | 0.352 | 0.584 |
Schooling (in years) | 12.69 | 12.68 | 12.22 | 12.32 |
Age (in years) | 44.03 | 42.83 | 44.50 | 42.28 |
Gender (1 = man, 0 = otherwise) | 0.793 | 0.747 | 0.684 | 0.638 |
Married (1 = married, 0 = otherwise) | 0.551 | 0.497 | 0.477 | 0.428 |
Children (1 = children, 0 = otherwise) | 0.651 | 0.594 | 0.585 | 0.550 |
Foreign-born (1 = foreign-born, 0 = otherwise) | 0.217 | 0.239 | 0.235 | 0.281 |
Establishment size (number of employees) | 76.51 | 44.25 | 78.50 | 46.18 |
Establishment age | 12.52 | 10.77 | 13.93 | 11.26 |
Multi-establishment | 0.336 | 0.268 | 0.390 | 0.325 |
Network characteristics |
Number of Links | 3.078 | | 3.177 | |
Years since the link was established | 7.763 | | 5.524 | |
Individuals | 2,718 | 22,821 | 84,761 | 431,327 |
Overall, approximately 10.6% of exiting self-employed workers and 15.7% of former employees have coworker links. The latter finding is in line with Hensvik and Skans (2016). Those who are hired from employment have coworker links more often, which could be because the self-employed did not have an opportunity to form coworker links in their prior role. This result would descriptively suggest that the formerly self-employed incur an experience cost via the lost opportunity to form coworker ties. This is further supported by the fact that the links are on average around 2 years older for the self-employed compared to the job changers. Otherwise, the number of coworker ties is similar across the two groups of new hires with a mean of 3 coworker links. However, it should be noted that the median value is 1 link per new hire, which implies that the number of coworker ties is likely to be skewed, as seen in Table 8 in the Appendix.
The formerly self-employed and job changers are similar across individual-level characteristics, which indicates that our matching effectively makes the two groups of individuals similar in observable characteristics. However, the self-employed seem to select themselves into slightly smaller and younger single-establishment firms. This highlights the importance of controlling for not only firm-level characteristics but also firm fixed effects, i.e., all unobservable firm characteristics. Controlling for these characteristics accounts for this selection of individuals into firms.