1 Introduction

A properly functioning judiciary, and in particular its ability to resolve cases in a timely fashion, is an essential cog in the wheel of the market economy. This statement is, if anything, even more apposite for transition economies, which may struggle in particular to enforce black letter rules through the court channel (Pistor et al. 2000). The existing survey-based evidence (see, e.g. World Bank 2018, CEPEJ 2018) together with studies based on selected state court data (Murrell 2001; Dimitrova-Grajzl et al. 2012, 2016) suggests that post-Communist countries have yet to succeed in establishing an efficient judiciary. This is a red flag for trade and commerce, as the available literature shows a clear link between an inefficient judiciary and adverse effects for the national economy. Jappelli et al. (2005) presented evidence showing that it is significantly easier for a private business to obtain a loan in those regions of Italy where courts operate effectively. In addition, other studies (e.g. García-Posada and Mora-Sanguinetti 2015; Giacomelli and Menon 2013; Rajan et al. 2001) show that efficient courts have a positive impact on average firm size in a country, as they reduce transaction costs, which may otherwise act as a barrier to corporate growth. The evident negative social consequences of an ineffective judiciary, as well as the scarcity of rigorous literature based on court data, prompted us to delve into this topic with regard to Poland.

Research on the determinants of court performance has a long history. A key moment came in the 1950s with a groundbreaking study on the economic causes of backlog in the New York court system (Zeisel et al. 1959). Since then, economic analyses have developed markedly. In the late 1970s and early 1980s, simple one-dimensional techniques (focusing on single variables such as the average length of court proceedings) were supplemented by multidimensional approaches that analysed the impact of various factors affecting court efficiency. These methods take either a frontier or an average approach. The former can be divided into non-parametric and parametric methods. Nevertheless, it was undoubtedly the parametric models that forged ahead, inclining the economic literature toward single- and multi-equation models.

Our contribution to the existing literature is threefold. Firstly, we concentrate our research on one post-Communist country (Poland) which went through a successful transition from central planning to the market economy (Balcerowicz 2005). The Polish court system, in contrast, is perceived as slow and inefficient by the public (Kociołowicz-Wiśniewska and Pilitowski 2017), a view which is confirmed by comparative studies (CEPEJ 2018; European Commission 2018). Moreover, there is a general paucity of research on the efficiency—or otherwise—of the post-Communist judiciary in comparison to advanced economies (e.g. Priest 1989; Kittelsen and Førsund 1992; Pedraja-Chaparro and Salinas-Jimenez 1996; Buscaglia and Ulen 1997; Rosales-López 2008; Lindquist et al. 2007; Mitsopoulos and Pelagidis 2007; DiVita 2012; Christensen and Szmer 2012; Santos and Amado 2014; Castro and Guccio 2014; Falavigna et al. 2015; Espasa and Esteller-Moré 2015).

Secondly, we simultaneously apply two different econometric methods: standard panel data models as well as one frontier model, namely stochastic frontier analysis (SFA). The latter appears to be the scarcest in the literature (Castro 2009; Antonucci et al. 2014; Espasa and Esteller-Moré 2015).

Thirdly, contrary to previous research (Beenstock and Haitovsky 2004; Dimitrova-Grajzl et al. 2012, 2016), we find that judges do matter for court performance in selected kinds of cases. In our opinion this finding relates to the degree of aggregation of data on cases, since it appears essential to distinguish between different kinds of cases if we wish to make a proper assessment of the role judges play in determining court output. To the best of our knowledge, this is one of the very few pieces of research to differentiate between cases in this vein and the first to estimate the production function for distinct case types. The distinction is important, at least with regard to the Polish judiciary: whereas specialised or auxiliary court staff can resolve non-trial cases within the court system, they can only provide support in trial cases, which must be resolved by a judge.

The paper is organised as follows. In Sect. 2 we provide a brief overview of literature covering determinants of court performance and court efficiency. In Sect. 3 we discuss in detail available data as well as the commercial court system in Poland. Section 4 presents our empirical data and results. We conclude in Sect. 5.

2 Determinants of court performance and court efficiency: literature review

The topic of court performance and judicial efficiency has received considerable attention in the literature, which can be divided into two groups based on the depth of analysis. The aim of the first group is principally to compare different courts in terms of performance and to identify the most and the least efficient. This branch of literature makes use of either performance indicators or Data Envelopment Analysis (DEA) to assess how efficiently courts transform their inputs (e.g. caseload, staff, technical equipment) into outputs (number of cases adjudicated). For instance, the DEA approach was used in a pioneering paper by Lewin et al. (1982) that analysed criminal courts in North Carolina and by Kittelsen and Førsund (1992) with regard to Norway. The DEA methodology was also employed in more recent literature, such as El-Bialy and García-Rubio (2011) (Egyptian courts), Yeung and Azevedo (2011) (Brazilian courts) and Santos and Amado (2014) (Portuguese courts). While this approach provides some useful data on court performance and can be used to compare courts, it does not enable researchers to determine strictly which factors affect output and efficiency.

For the reasons stated above, other methods are used in the research to investigate factors that affect court performance, e.g. two-step DEA-OLS analysis, panel least squares regression and stochastic frontier analysis. Due to their parametrical nature, these methodologies precisely determine the impact of each factor on the performance/efficiency of a court. They are also more robust as regards outlying observations. For these reasons we employ parametrical methods in our study.

Consequently, the available literature in the second group presents more in-depth conclusions on why courts work efficiently or not. The literature presents us with an immediate paradox: the number of judges in itself has either a limited impact on court output (Dimitrova-Grajzl et al. 2016) or no impact at all (Beenstock and Haitovsky 2004; Dimitrova-Grajzl et al. 2012). These papers found that judges tend to manage their work in a way that allows them to deal with almost all the cases they receive; their productivity increases when they are flooded with cases. Hence, if the aim is to increase the output of the courts, merely increasing the number of judges is not the simple solution that intuition might suggest. In other words, courts are demand-driven, as they adjust to the demand for justice. Moreover, research shows that court performance can be affected by political, institutional and socio-economic covariates, such as the regulation of judicial careers and proceedings in the area of jurisdiction (Castro 2009), or the strategy of collecting evidence in criminal cases and court budgets, which can increase efficiency (Antonucci et al. 2014). On the other hand, temporary employment of judges and high staff turnover generally may have a detrimental effect on court performance (Rosales-López 2008; Espasa and Esteller-Moré 2015).

In addition to the literature presented above, reference should be made to a paper that used country-level data to analyze factors shaping the efficiency of legal systems in the 47 countries represented in the Council of Europe (Voigt and El-Bialy 2016). This research shows that court performance does not depend on the country’s level of economic development (measured by GDP per capita). Moreover, high court budgets and independent judicial councils have a negative impact on performance, whereas mandatory training for judges increases resolution rates.

When discussing the second group of studies we should be conscious of the issue of endogeneity of variables. For example, efficient courts, i.e. those that are able to quickly resolve logged cases, may attract more newly-filed cases than less efficient courts (Buscaglia and Ulen 1997; Dimitrova-Grajzl et al. 2012, 2016). Beenstock and Haitovsky (2004) and Dimitrova-Grajzl et al. (2012) point out that the appointment of judges can be a response to delays generated by a given court. The endogeneity problem can be addressed through instrumental variables, which we expound on later in this paper. The aim of our study is to contribute to the existing literature by, first, including case heterogeneity and, second, analyzing regional judicial performance and its determinants in a large transition country, i.e. Poland.

3 The court system in Poland and the database

3.1 The commercial court system in Poland

The foundations of the justice system in Poland are embedded in the 1997 Constitution of the Republic of Poland. The general court system is comprised of common courts divided into specialised divisions, e.g. commercial courts (see Footnote 1). The administration of justice consists of three instances, with the Supreme Court at the top of the ladder. The Supreme Court performs the ultimate function in the common judiciary, exercising, inter alia, institutional and extra-institutional supervision. The former consists in hearing cassation complaints and other remedies, and the latter in adopting resolutions that resolve legal issues in the event of divergent interpretations in case law. Supervision over the administrative activities of common courts, excluding the Supreme Court, is exercised by the Minister of Justice.

There are 317 district courts, 45 regional courts and 11 appeal courts. Every district court must maintain two divisions (criminal and civil), whereas commercial divisions are established by decision of the Minister of Justice, which also determines the borders of their geographical jurisdiction. At the time of writing there are 66 commercial divisions (commonly called commercial courts) within district courts located throughout Poland. Their main task is to resolve commercial cases.

The inflow of commercial cases relates to their commercial nature and the amount in controversy. The former refers to the claimant and defendant, both of which must be businesses, as the commercial courts of Poland only deal with cases between so-called ‘entrepreneurs’ (if one of the parties does not fulfil the entrepreneur definition, the case falls to the civil division of the common court). There is also a monetary threshold: disputes over property rights worth less than PLN 75,000 (c. €17,400) are examined at first instance by the commercial divisions of the district courts; the commercial divisions of regional courts deal with cases above this threshold. Parties may challenge a first-instance judgment or order through an appeal or complaint, which is heard by the higher instance (for a District Court decision this is the appeal department of the Regional Court, and for first-instance decisions of the latter it is the Court of Appeal).

Judges are appointed by a competitive open procedure. They are granted tenure from the start of their career, but can be promoted to a higher instance depending on various assessment criteria following an open public competition. Judges are allocated to the divisions of a court by the president (judge) of that court. However, a judge may serve in more than one division, in which case his/her time is divided by order of the court president. Commercial judges are expected to have economic knowledge, a condition which is difficult to assess and depends on the subjective opinion of the president of the court. Commercial divisions are organised within offices which have auxiliary staff, in general including a court clerk (urzędnik sądowy), a judge assistant (asystent sądowy) and a legal clerk (referendarz). The court clerk may execute general or specialised tasks. The assistant focuses on preparing written justifications of judgments. It is not uncommon for one judge assistant to be shared by two or more judges or for there to be no assistant at all. A legal clerk may perform certain judicial activities. For instance, s/he can conduct some non-trial proceedings, in particular regarding payment orders. The composition of a commercial court’s staff depends ultimately on the number of posts approved by the Ministry of Justice. The Ministry decides, by allocating budgetary resources, whether the president of the court may employ a more or less specialised labour force in the commercial division.

3.2 Database

For the purposes of investigating the activity of district commercial courts in Poland we made use of a unique dataset provided by the Ministry of Justice, containing annual data on the number of logged, pending and resolved cases, serving judges and auxiliary staff members in the period 2009–2016.

The database enables us to distinguish between three categories of commercial cases adjudicated and resolved by this type of court. The first category consists of cases requiring a full trial (denoted FT cases). The second consists of cases that are adjudicated on the basis of simplified procedures known as writ-of-payment [postępowanie upominawczo-nakazowe] (denoted WP cases). WP proceedings by their nature require significantly less attention from serving judges and rely in particular on the work of auxiliary staff, especially court clerks. The third group consists of non-litigious commercial cases (denoted NL cases), i.e. cases where there are no rival parties but where the court is called upon to formally recognize a certain state of affairs. As there are procedural differences between these three types of commercial cases, we analyse them separately. The data on the number of judges and court staff refer only to those who deal directly with commercial cases and are expressed in full-time equivalents. In the years 2009–2016 the largest group of cases adjudicated by Polish first-instance commercial courts were cases involving writs of payment (77% of cases included in this study). Full trial cases accounted for 23% of cases, whereas the share of non-litigious cases was merely 0.07%. Our data did not allow us to measure the exact time judges spent on adjudicating cases of different types; for this reason we use the data on the number of judges in the entire commercial court.

In order to achieve data comparability we had to remove some courts from the dataset provided by the Ministry of Justice. Firstly, for 2011 data we excluded the courts for which it was their last year of existence, as the number of cases adjudicated by them was not consistent with other courts (they stopped accepting new cases and operated only to clear their caseload). Moreover, we excluded one district court in Lublin as it is an e-court that deals exclusively with electronic cases filed from the whole country and another which was affected by frequent reforms that changed its areas of responsibility. Our ultimate database used for the following quantitative analysis contains data for 63 commercial district courts in the period 2009–2010, 54 courts in the years 2011–2014 and 53 courts in the years 2015–2016 (the falling number of courts is attributed to judicial reforms).

Moreover, in the part of this study related to the determinants of judicial inefficiency we also included three variables proxying the economic situation in each geographic jurisdiction, namely: absolute income per capita, number of firms per 10,000 inhabitants and the share of privately-owned enterprises. As judicial districts in Poland do not overlap with counties (powiaty), the base unit of the statistical territorial division of the country, we had to obtain the data for each territorial jurisdiction by statistical weighting. To do this we used population data from municipalities (gminy), which are lower-level territorial units, to weight data from powiaty belonging to more than one territorial jurisdiction. This data was used to ascertain whether there is a relationship between regional economic standing and business demographics, on the one hand, and local court performance on the other.
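The weighting step described above can be illustrated with a minimal sketch. It assumes that each gmina is assigned to exactly one powiat and one court jurisdiction, and that a powiat-level indicator is averaged over a jurisdiction using gmina populations as weights. All names and numbers below are hypothetical and are not taken from the dataset; the exact weighting scheme used in the study may differ.

```python
from collections import defaultdict

# Hypothetical inputs: every name and number here is illustrative only.
# Each gmina (municipality) belongs to one powiat (county) and one jurisdiction.
gminy = [
    # (gmina, powiat, jurisdiction, population)
    ("g1", "P1", "CourtA", 30_000),
    ("g2", "P1", "CourtA", 10_000),
    ("g3", "P2", "CourtA", 20_000),  # powiat P2 is split between two courts
    ("g4", "P2", "CourtB", 60_000),
]
# Powiat-level indicator, e.g. number of firms per 10,000 inhabitants.
firms_per_10k = {"P1": 900.0, "P2": 1200.0}


def jurisdiction_average(gminy, indicator):
    """Population-weighted mean of a powiat-level indicator per jurisdiction."""
    weighted_sum = defaultdict(float)
    population = defaultdict(float)
    for _gmina, powiat, court, pop in gminy:
        weighted_sum[court] += pop * indicator[powiat]
        population[court] += pop
    return {court: weighted_sum[court] / population[court] for court in population}


result = jurisdiction_average(gminy, firms_per_10k)
print(result)
```

In this toy example the jurisdiction of CourtA draws two thirds of its population from P1 and one third from P2, so its indicator lies between the two powiat values.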

The outcome variable measuring district court activity is the number of resolved cases. Therefore, the analysis deals with only one aspect of judicial performance, namely the ability of courts to resolve commercial cases. The choice of dependent variable was entirely determined by limited data availability in Poland and thus we were unable to investigate issues such as the quality of the judicial system. Empirical studies devoted to analysis of judicial performance have taken a similar approach (e.g. Beenstock and Haitovsky 2004; Dimitrova-Grajzl et al. 2012, 2016). Unlike these studies, our research focuses exclusively on commercial cases and distinguishes between two major types of cases. This is of great importance for the analysis and interpretation of our results compared with the findings of other studies, which could be affected by statistical bias owing to the lack of discrimination between different types of adjudicated cases.

We applied a set of explanatory variables to investigate judicial performance in Poland: judicial staffing, caseload and auxiliary staff members as likely determinants of court efficiency in resolving cases. Judicial staffing is measured by the number of serving judges (expressed in full-time positions) who handle commercial cases in district courts. The caseload is measured by the number of cases that have been filed with a court during a year and the number of pending cases that remained unresolved from previous years. The third group of explanatory variables contains ratios of three types of auxiliary staff members to judges, namely legal clerks, judge assistants and court clerks. Their primary role is to provide necessary help to judges rather than to adjudicate on their own. Hence, they are expressed as ratios to the number of judges and not as typical court inputs.

Detailed definitions of dependent and explanatory variables as well as the summary statistics are presented in the “Appendix” in Tables 10 and 11, respectively. A brief overview of the data: the number of FT cases filed annually with a typical commercial court in the period 2009–2016 averaged 1722, just one-third of the number of filed WP cases (5049). For non-litigious cases the number was just 5.3. The ratio of the average number of resolved FT cases to resolved WP cases was roughly the same, in raw number terms 1530 to 5009. Furthermore, the average number of serving judges in the analysed period was 5.1 per court, but it ranged in individual courts from 0.7 to 45.3 (full-time equivalent) (Fig. 1).
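As a quick check on these averages, the ratio of resolved to filed cases can be computed for the two main case types. This ratio (often called a clearance rate) is our own illustrative summary and is not a variable used in the estimations; the figures are the averages quoted above.

```python
# Average annual figures per court, 2009-2016, as quoted in the text.
filed = {"FT": 1722, "WP": 5049}
resolved = {"FT": 1530, "WP": 5009}

# Ratio of resolved to filed cases; a value below 1 means the backlog grows.
clearance = {case_type: resolved[case_type] / filed[case_type] for case_type in filed}

for case_type, rate in clearance.items():
    print(f"{case_type}: {rate:.3f}")
```

The ratio is visibly lower for FT cases than for WP cases, which is in line with the growing backlog of pending FT cases discussed in this section.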

Fig. 1 Average number of new, pending and resolved cases (all types) and average number of serving judges (full-time equivalent) in the period 2009–2016. Source: Researchers’ own calculations using data from the Ministry of Justice

First, the analysis of the number of filed and resolved cases over time indicates that both series exhibit a clear upward trend for FT (Fig. 2) and NL (Fig. 3) cases throughout the entire period 2009–2016, but that is not the case for WP cases (Fig. 4). Secondly, the number of FT cases filed every year with courts exceeded the number of resolved cases (Fig. 2). Thirdly, the data show growing backlogs: in 2009 the average number of pending FT cases did not exceed 500, but in 2016 it topped 1500. Similar preliminary conclusions (upward trend, growing backlogs) can be reached for non-litigious cases. In contrast, an almost flat number of pending WP cases over time may suggest that courts performed better in dealing with WP cases than FT cases (Fig. 4). Lastly, in spite of the growing number of filed and pending FT cases, the average number of serving judges per court increased only modestly, from 4.8 in 2009 to 6.1 in 2016. The number of auxiliary staff (court clerks and judge assistants) per judge remained relatively stable (Fig. 5).

Fig. 2 Average number of new, pending and resolved FT (full trial) cases in the period 2009–2016. Source: Researchers’ own calculations using data from the Ministry of Justice

Fig. 3 Average number of new, pending and resolved NL (non-litigious) cases in the period 2009–2016. Source: Researchers’ own calculations using data from the Ministry of Justice

Fig. 4 Average number of new, pending and resolved WP (writ-of-payment) cases in the period 2009–2016. Source: Researchers’ own calculations using data from the Ministry of Justice

Fig. 5 Number of auxiliary staff members per judge in Polish commercial courts in the period 2009–2016. Source: Researchers’ own calculations using data from the Ministry of Justice

4 Estimation strategy

This section presents the estimation strategy we apply to analyse the determinants of the number of resolved cases filed with first-instance commercial courts in Poland and the efficiency of their resolution. As a first step, we focus on average-based panel data models and run pooled OLS regressions. In order to account for potential endogeneity, we then employ two-way fixed effects panel models. In line with previous studies (Dimitrova-Grajzl et al. 2012, 2016) we verify the robustness of our results using instrumental variables to control for possible endogeneity that may stem from reverse causality (e.g. more judges appointed to courts that deal with an increased number of filed cases) or omitted variables. In the second step, we employ stochastic frontier analysis (SFA) to investigate the extent to which caseload and judges determine the maximum feasible number of resolved cases (i.e. the frontier), as well as the factors contributing to court efficiency (i.e. the ability of courts to transform inputs into resolved cases).

Our empirical strategy is therefore based on estimating the production function of a court. However, this does not imply that a non-significant coefficient on judges can be treated as a claim that judges do not matter in a court of law. Instead, it should be interpreted as a signal that an increase in the number of judges does not increase the number of cases adjudicated, as output is driven predominantly by the demand for judicial services.

4.1 Panel data approach

We start from a pooled ordinary least squares model, explaining the number of resolved cases by the number of serving judges and the caseload, divided into new cases received by the court in a given year and pending cases (i.e. cases filed with the court in earlier years but not yet adjudicated). The pooled model specification we analysed takes the following form:

$$Resolved_{nt} = \beta_{0} + \beta_{1} *New_{nt} + \beta_{2} *Pending_{nt} + \beta_{3} *Judges_{nt} + \varepsilon_{nt}$$
(1)

Although the pooled OLS model does not control for court heterogeneity and time effects, we use clustered standard errors to keep inference robust to heteroskedasticity and within-cluster correlation. The logarithmic transformation of all variables is used so that the estimated coefficients can be interpreted as elasticities of the number of resolved cases with respect to the caseload and the number of serving judges. We estimate the coefficients of Eq. 1 for all commercial cases combined, as well as separately for the three commercial case types that we were able to distinguish. This granularity is key to investigating differences between the impact of our set of explanatory variables, especially serving judges, on the number of resolved cases.
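To make the elasticity interpretation explicit, Eq. 1 can be restated with the logarithms written out. This is only a restatement under the assumption that natural logarithms are applied to every variable; the symbols are those of Eq. 1.

$$\ln Resolved_{nt} = \beta_{0} + \beta_{1} \ln New_{nt} + \beta_{2} \ln Pending_{nt} + \beta_{3} \ln Judges_{nt} + \varepsilon_{nt} ,\qquad \beta_{3} = \frac{\partial \ln Resolved_{nt} }{\partial \ln Judges_{nt} }$$

For example, a purely hypothetical value of \(\beta_{3} = 0.2\) would mean that a court with 10% more judges resolves roughly 2% more cases, holding the caseload fixed.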

The commercial courts in Poland vary in terms of their individual traits, such as organisation, IT equipment, management skills of the president of the court, etc., which can potentially affect their output. Legal changes that happen over time, such as reforms of the judicial system, may also influence output. In order to address potential bias deriving from these two types of unobservable factors, we apply a two-way fixed effects model by augmenting Eq. 1 with court and time fixed effects, since ignoring them may skew the results. Court fixed effects \(u_{n}\) control for all court-level, time-invariant factors that may affect court output (e.g. court expertise) as well as judicial staffing and demand for court services, such as geographic differences in population density or propensity for litigation. The year dummies \(\lambda_{t}\) control for any unobservable factors that influence the number of adjudications of all commercial courts but vary across years, such as reform of the judicial system or any other kind of policy change impacting the judicial system as a whole, as well as business cycle effects. Additionally, we augment our panel model specification with a court-specific linear time trend \((u_{n} t)\), which controls for any unobserved court-specific trends. These may exist, for instance, if some courts exhibit an increasing trend in the number of resolved cases and also experience an increasing trend in judicial staffing or the number of filed cases. If this is the case, the data might indicate an association between the number of serving judges or filed cases and the number of resolved cases, even though this association is not causal. The court-specific time trend is intended to address these concerns and thereby to reduce the risk of biased estimates. These features of the panel approach make this strategy preferable to the pooled approach. The two-way fixed effects model specifications we investigated take the following forms:

$$Resolved_{nt} = \beta_{0} + \beta_{1} *New_{nt} + \beta_{2} *Pending_{nt} + \beta_{3} *Judges_{nt} + u_{n} + \lambda_{t} + \varepsilon_{nt}$$
(2a)
$$Resolved_{nt} = \beta_{0} + \beta_{1} *New_{nt} + \beta_{2} *Pending_{nt} + \beta_{3} *Judges_{nt} + u_{n} + \lambda_{t} + u_{n} t + \varepsilon_{nt}$$
(2b)
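The logic of Eq. 2a can be sketched on simulated data. The example below is a minimal illustration, not the estimation code used in this study: it assumes a balanced panel, uses only two regressors, omits the court-specific trend of Eq. 2b, and all parameter values are hypothetical. It shows that removing court and year means (the within transformation) recovers the slope coefficients even when the regressors are correlated with the court effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n_courts, n_years = 50, 8
beta_true = np.array([0.6, 0.3])  # hypothetical elasticities

court_fe = rng.normal(size=(n_courts, 1))  # u_n, constant over time
year_fe = rng.normal(size=(1, n_years))    # lambda_t, common to all courts
# Regressors are correlated with the court effect, so pooled OLS is biased.
x1 = court_fe + rng.normal(size=(n_courts, n_years))
x2 = 0.5 * court_fe + rng.normal(size=(n_courts, n_years))
noise = 0.01 * rng.normal(size=(n_courts, n_years))
y = beta_true[0] * x1 + beta_true[1] * x2 + court_fe + year_fe + noise


def two_way_demean(a):
    """Subtract court means and year means, add back the grand mean.

    For a balanced panel this removes additive court and year effects.
    """
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()


def ols(y_mat, x_mats):
    """Least-squares coefficients of y on the stacked regressors."""
    X = np.column_stack([x.ravel() for x in x_mats])
    coef, *_ = np.linalg.lstsq(X, y_mat.ravel(), rcond=None)
    return coef


# Pooled OLS with an intercept (first coefficient dropped from the output).
beta_pooled = ols(y, [np.ones_like(y), x1, x2])[1:]
# Two-way fixed effects via the within transformation.
beta_fe = ols(two_way_demean(y), [two_way_demean(x1), two_way_demean(x2)])

print("pooled OLS:", beta_pooled)
print("two-way FE:", beta_fe)
```

With an unbalanced panel, such as the one used here where the number of courts changes over time, simple double demeaning is no longer exact and the fixed effects are usually handled with dummy variables or iterative demeaning instead.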

Although two-way fixed effects models address the potential endogeneity problem of the independent variables, their results might still be biased due to possible reverse causality. In our case, reverse causality may occur if the number of serving judges delegated to a court increases as a result of a growing number of new or pending cases at that court. Another potential source of reverse causality arises if the parties decide to settle the dispute in a court outside of their jurisdiction. This may happen, for example, if the parties use contractual provisions allowing them to file lawsuits related to their contracts with a court located outside their jurisdiction, an option available under Polish law.

Following Dimitrova-Grajzl et al. (2012, 2016) we address potential reverse causality by employing an instrumental variable technique. Following Wooldridge (2002), we use the panel structure of our data set and instrument first differences of endogenous variables (see Footnote 2) with their second lags. The regression we analysed has the following specification:

$$\Delta Resolved_{nt} = \rho *\Delta Resolved_{n,t - 1} + \beta_{1} *\Delta New_{nt} + \beta_{2} *\Delta Pending_{nt} + \beta_{3} *\Delta Judges_{nt} + \Delta \lambda_{t} + \Delta \varepsilon_{nt}$$
(3)

The model specification includes the lagged number of resolved cases as an additional regressor, as it might also control for potential endogeneity. Equation 3 can be estimated by using either the two-stage least squares estimation (IV-2SLS) proposed by Anderson and Hsiao (1982) or the generalized method of moments (IV-GMM) developed by Arellano and Bond (1991). As stated by Roodman (2009), the GMM estimator better mitigates the trade-off between lag length and sample size than the 2SLS approach. Hence, we estimated the coefficients of Eq. 3 using the IV-GMM method.
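The idea of instrumenting a first-differenced lag with a second lag in levels can be illustrated with the simpler Anderson and Hsiao (1982) estimator on simulated data. This is a sketch under strong assumptions, not the IV-GMM procedure applied in this paper: the model contains only a lagged dependent variable and a unit fixed effect, the panel is balanced, and the persistence parameter and sample sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_periods, burn_in = 20_000, 8, 20
rho_true = 0.5  # hypothetical persistence parameter

alpha = rng.normal(size=n_units)  # unit fixed effects
y = np.zeros((n_units, n_periods + burn_in))
y[:, 0] = alpha / (1 - rho_true) + rng.normal(size=n_units)
for t in range(1, n_periods + burn_in):
    y[:, t] = rho_true * y[:, t - 1] + alpha + rng.normal(size=n_units)
y = y[:, burn_in:]  # keep the last n_periods observations

# First differences remove alpha: dy_t = rho * dy_{t-1} + d(eps_t).
dy = np.diff(y, axis=1)    # dy[:, k] = y[:, k + 1] - y[:, k]
dep = dy[:, 1:].ravel()    # dy_t
lag = dy[:, :-1].ravel()   # dy_{t-1}, correlated with d(eps_t)
inst = y[:, :-2].ravel()   # y_{t-2}, uncorrelated with d(eps_t)

rho_ols = (lag @ dep) / (lag @ lag)    # OLS on differences, biased
rho_iv = (inst @ dep) / (inst @ lag)   # just-identified IV estimator

print(f"OLS on first differences: {rho_ols:.3f}")
print(f"IV with second lag:       {rho_iv:.3f}")
```

The GMM estimators discussed in the text generalise this idea by using several lags as instruments at once and weighting the resulting moment conditions.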

The validity of results obtained by the instrumental variables approach depends heavily on the strength of the applied instruments. As stated by Blundell and Bond (1998), the lagged levels of explanatory variables that we use may be weak instruments in the equation estimated on differences. In order to reduce the potential biases and imprecision associated with the difference GMM estimator proposed by Arellano and Bond (1991), we apply the system GMM estimator that combines in a system the regression in differences with the regression in levels (Blundell and Bond 1998). The instruments for the regression in levels are the differences of the corresponding variables (second and further lags). The potential bias resulting from instrument proliferation (Roodman 2009) is addressed by ‘collapsing’ our instrument sets. We verify the consistency of the system GMM estimator and whether lagged explanatory variables are valid instruments by applying two diagnostic tests: the Hansen test of over-identifying restrictions and the Arellano-Bond test for serial correlation of the error term.

4.2 Stochastic frontier approach

Having explored the average impact of caseload and judges on the number of resolved cases, we move on to stochastic frontier analysis (SFA) to verify the determinants of the effectiveness of commercial courts. This approach explores the robustness of our estimations from the OLS two-way fixed effects and IV-GMM approaches regarding the role played by judges in resolving commercial cases. Moreover, it investigates the determinants of inefficiency of the commercial courts in Poland, taking into account two distinct types of explanatory variables. Firstly, it verifies whether court performance is significantly affected by the number of auxiliary judicial staff members employed in courts. The decisions on the number of auxiliaries in a court are made by the Ministry of Justice and not by the court itself, so the respective variables are exogenous. Secondly, it explores to what extent court efficiency in resolving commercial cases is determined by the features of the judicial districts the courts serve (e.g. economic development or business demographics).

The SFA method, first proposed by Aigner et al. (1977), is a parametric technique, but there are two key differences between SFA and the standard “average” OLS, IV-2SLS or IV-GMM models. First, in SFA models the theoretical value of the dependent variable is not its expected mean value but its expected maximum value (productivity frontier). As the SFA approach is used predominantly to investigate productive units, this maximum value can be interpreted as the maximum number of resolved cases given the available inputs: caseload and judges.

The second difference deals with the distribution of the model residuals (i.e. the difference between the theoretical and the empirical values of the dependent variable). In OLS the residual is assumed to be a random error with an expected value equal to zero and fixed variance. In contrast, in the SFA approach (estimated by the maximum likelihood method), the residuals are composed of two elements: the first is an independently and identically distributed random error, and the second is a non-negative inefficiency term that enters the model with a negative sign and captures a productive unit’s incapability (be it technical or economic) of reaching its frontier output. This inefficiency can be subsequently explained with another set of explanatory variables.

As stressed by Greene (2005), if court-specific heterogeneity is not adequately controlled for, the estimated inefficiency may be picking up court-specific heterogeneity in addition to or even instead of inefficiency. Thus, the inability of a model to estimate individual effects in addition to the inefficiency effect poses a problem for empirical research. Hence, the inefficiency effect and the time-invariant court-specific effect are different and should be accounted for separately in the estimation. To address these concerns, Greene (2005) proposed the true fixed effects model, which accounts for unobserved court specific heterogeneity along with time varying inefficiency. The SFA regression we applied has the following specification:

$$\begin{aligned} & y_{it} = \alpha_{i} + {\varvec{X}}_{{{\varvec{it}}}} {\varvec{\beta}} - u_{it} + v_{it} \\ & v_{it} \sim N\left( {0,\sigma_{v}^{2} } \right) \\ & u_{it} = \gamma_{0} + {\varvec{Z}}_{{1,{\varvec{it}}}} {\varvec{\gamma}}_{1} + {\varvec{Z}}_{{2,{\varvec{it}}}} {\varvec{\gamma}}_{2} + u_{i}^{*} \\ & u_{i}^{*} \sim N^{ + } \left( {\mu ,\sigma_{u}^{2} } \right),\quad i = 1, \ldots ,N,\quad t = 1, \ldots , T \\ \end{aligned}$$
(4)

where \(y_{it}\) represents the production frontier (i.e. maximum feasible number of resolved cases) and vector \({\varvec{X}}_{{{\varvec{it}}}}\) includes the court inputs used for adjudication of filed cases, namely new and pending cases as well as the number of judges (all transformed into logarithms). The term \(\alpha_{i}\) denotes the time-invariant court-specific effects controlling for heterogeneity of courts in the panel. Including this term separates the time-varying inefficiency term from the time-invariant individual fixed effects. The inefficiency term \(u_{it}\) is explained by internal court-specific variables included in vector \({\varvec{Z}}_{{1,{\varvec{it}}}}\) (i.e. the court clerks to judges ratio, the assistants to judges ratio and the civil servants to judges ratio) and external region-specific variables included in vector \({\varvec{Z}}_{{2,{\varvec{it}}}}\). The latter set of covariates captures economic features of each court jurisdiction. This vector consists of annual gross salary, the annual number of registered businesses per 10,000 inhabitants and the annual ratio of the average number of privately owned firms to the total number of firms registered in each court jurisdiction. These variables can be interpreted as proxies for regional economic development, which is likely to affect the difficulty and complexity of cases adjudicated by the courts. Hence, they might have an impact on court efficiency in resolving filed cases. We run SFA regressions for all cases combined as well as for the three different types of cases.
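To make the roles of the terms in Eq. 4 concrete, the following sketch draws a synthetic court panel from a simplified version of the specification. Everything in it is an assumption made for illustration: the parameter values are invented, a single regional covariate `z` stands in for both Z vectors, the one-sided component is half-normal (μ = 0) and drawn per observation, and the dependent variable is taken to be in logarithms so that exp(−u) can be read as technical efficiency. It does not reproduce the estimation used in the paper.

```python
import math
import random

def simulate_frontier_panel(n_courts=50, n_years=8, seed=0):
    """Draw a synthetic panel from a simplified version of Eq. 4 (all values made up)."""
    rng = random.Random(seed)
    beta_new, beta_judges = 0.9, 0.2   # hypothetical frontier elasticities
    gamma0, gamma1 = 0.05, 0.3         # hypothetical inefficiency coefficients
    sigma_v, sigma_u = 0.1, 0.2        # hypothetical error scales
    rows = []
    for court in range(n_courts):
        alpha_i = rng.gauss(1.0, 0.5)  # time-invariant court-specific effect
        for year in range(n_years):
            log_new = rng.gauss(7.0, 1.0)      # log of newly filed cases
            log_judges = rng.gauss(2.0, 0.5)   # log of serving judges
            z = rng.random()                   # regional covariate scaled to [0, 1)
            v = rng.gauss(0.0, sigma_v)            # symmetric random error
            u_star = abs(rng.gauss(0.0, sigma_u))  # one-sided draw, always >= 0
            u = gamma0 + gamma1 * z + u_star       # inefficiency term, >= 0 here
            frontier = alpha_i + beta_new * log_new + beta_judges * log_judges
            rows.append({
                "court": court,
                "year": year,
                "frontier": frontier,
                "y": frontier - u + v,        # observed (log) output
                "u": u,
                "efficiency": math.exp(-u),   # share of frontier output achieved
            })
    return rows

rows = simulate_frontier_panel()
mean_gap = sum(r["y"] - r["frontier"] for r in rows) / len(rows)
mean_eff = sum(r["efficiency"] for r in rows) / len(rows)
print(f"mean(y - frontier) = {mean_gap:.3f}, mean efficiency = {mean_eff:.3f}")
```

In the simulated data the court effect shifts the frontier itself, whereas `u` only pulls observed output below it; this is the separation that the true fixed effects model is designed to preserve.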

5 Results

5.1 Pooled OLS model

The results of the pooled OLS model (Eq. 1), which is the starting point of our analysis, indicate that the principal driving force of commercial courts in Poland is the demand for their services, represented by the number of new cases filed to courts (Table 1). For all analysed cases the point estimate indicates that a 10% increase in the number of newly filed commercial cases is followed on average by a 9.7% growth in the number of resolved cases (col. 1). This suggests there is an almost one-to-one relationship between the number of new cases and the cases resolved. The elasticity of the number of resolved cases to the number of filed cases is also high for cases requiring a full court trial (col. 2) and writ-of-payment cases (col. 3); a relatively lower elasticity was established for non-litigious cases (col. 4). The estimated coefficients of the pending cases are all positive and statistically significant, but the estimates are much lower than for new ones. This indicates that the commercial courts’ output is driven foremost by the newly filed cases. In contrast, the role played by pending cases in explaining the court output turned out to be almost negligible. The results of the pooled OLS estimations indicate that the number of judges contributes to court output, but this relationship was statistically significant only for cases requiring a full trial (col. 2). Adopting a causal interpretation, a 10% increase in the number of judges was associated with less than 1% growth in the number of full trial cases adjudicated by courts.

Table 1 Baseline regression results: pooled OLS

The estimates of the pooled OLS regression may be biased due to either their inability to control for heterogeneity between courts or endogeneity resulting from possible omitted variables. They constitute only a starting point for research and should be interpreted with caution. We address these issues by applying two-way fixed effects models and instrumental variable models, as presented in the following sections.
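The heterogeneity problem can be illustrated on simulated data. The sketch below is a toy example under stated assumptions: a balanced panel, a single regressor, an invented true slope of 0.2 and a regressor that is negatively correlated with the unobserved court effect. It compares the pooled OLS slope with the slope obtained after a two-way within transformation (subtracting court means and year means and adding back the grand mean). All names and numbers are hypothetical and none of them come from the court dataset.

```python
import random

def ols_slope(x, y):
    """Slope of a simple regression of y on x with an intercept."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def two_way_demean(values):
    """values[i][t] -> value - court mean - year mean + grand mean (balanced panel)."""
    n_units, n_periods = len(values), len(values[0])
    unit_means = [sum(row) / n_periods for row in values]
    period_means = [sum(values[i][t] for i in range(n_units)) / n_units
                    for t in range(n_periods)]
    grand = sum(unit_means) / n_units
    return [[values[i][t] - unit_means[i] - period_means[t] + grand
             for t in range(n_periods)] for i in range(n_units)]

def flatten(matrix):
    return [v for row in matrix for v in row]

def simulate_panel(n_courts=200, n_years=10, beta=0.2, seed=1):
    """Toy panel in which the regressor is correlated with the court effect."""
    rng = random.Random(seed)
    year_effects = [rng.gauss(0.0, 0.5) for _ in range(n_years)]
    x, y = [], []
    for _ in range(n_courts):
        alpha_i = rng.gauss(0.0, 1.0)   # unobserved court effect
        x_row, y_row = [], []
        for t in range(n_years):
            x_it = -0.8 * alpha_i + rng.gauss(0.0, 1.0)
            y_it = alpha_i + year_effects[t] + beta * x_it + rng.gauss(0.0, 0.3)
            x_row.append(x_it)
            y_row.append(y_it)
        x.append(x_row)
        y.append(y_row)
    return x, y

x, y = simulate_panel()
pooled = ols_slope(flatten(x), flatten(y))
within = ols_slope(flatten(two_way_demean(x)), flatten(two_way_demean(y)))
print(f"pooled slope = {pooled:.3f}, two-way within slope = {within:.3f}")
```

In this toy setting the pooled slope is pulled away from the true value because the regressor absorbs part of the court effect, while the within slope stays close to it. The direction of the pooled bias depends on the sign of the assumed correlation.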

5.2 Two-way fixed effects models

The estimates of the two-way fixed effects model (Eq. 2a) are presented in Table 2. The results confirm the finding from the pooled OLS regressions that the key driving force of commercial court output is the demand for judicial services, and this holds for all distinguished types of cases. Adopting a causal interpretation, the results from the two-way fixed effects models indicate that a 10% increase in the number of all newly filed cases leads to a 9.5% increase in the number of resolved cases. The estimates from regressions that account for fixed effects indicate that pending cases have a slightly larger impact on the number of resolved cases of FT type than in the pooled models (col. 2). However, contrary to pooled OLS, the results from the two-way fixed effects models indicate that the coefficients of serving judges are statistically significant for all cases (col. 1) and for cases that require a full court trial (col. 2). In a causal interpretation, the point estimates indicate that a 10% increase in the number of serving judges is associated with a 2% growth in the number of adjudicated cases of FT type and only a 0.6% increase in the number of all cases combined. This means that, having controlled for court and year fixed effects, the estimated coefficient of judges for FT type cases is almost three times as large as that obtained from the pooled regression.

Table 2 Two-way fixed effect models

The introduction of court-specific time trends to the models does not affect the significance of the coefficients, and their values do not differ greatly from those presented in Table 2. The only difference of note is that the impact of judges on the number of adjudicated FT type cases is even higher when we account for court-specific time trends (Table 3). Accordingly, the results of the fixed effects model appear robust to biases that may arise owing to time trends. In both model specifications the coefficient of judges turned out to be statistically insignificant for writ-of-payment and non-litigious cases (col. 3, 4).

Table 3 Two-way fixed effect models (including court-specific time trends)

5.3 Instrumental variable approach

Table 4 presents the coefficients of Eq. 3, which were estimated by applying the GMM instrumental variables approach. They confirm the findings from the two-way panel models that commercial court output in Poland is mainly driven by the demand for court services. The coefficients of new and pending cases were significant for all cases combined as well as for the three types of cases we distinguished.

Table 4 Instrumental variable regression results: GMM—Arellano-Bond estimator

The results of the IV-GMM approach indicate that the number of serving judges is significant for cases requiring a full court trial (col. 2). The estimates of judges turned out to be insignificant for all cases and for writ-of-payment cases. These IV-GMM findings are consistent with the results of the two-way FE panel models with respect to individual case types. However, for all cases pooled together the significance of judges disappears, pointing to endogeneity issues similar to the ones observed by Dimitrova-Grajzl et al. (2012, 2016). Moreover, it is worth mentioning that, having controlled for potential endogeneity by the GMM approach, the significant coefficient of judges for full trial cases (0.30) appears to be higher than the estimates in the two-way panel models. Under a causal interpretation, the GMM results indicate that a 10% increase in the number of serving judges is associated with a 3.0% increase in the number of resolved FT type cases. The Hansen test results confirm that in all model specifications our instruments are not correlated with the residuals, whereas the Arellano-Bond serial correlation test results support the assumption that the errors exhibit no second-order autocorrelation.
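The percentage interpretation of the coefficients can be checked with a small calculation. The sketch assumes a log-log specification, so that a coefficient is read as an elasticity; the helper names are hypothetical. It compares the usual linear reading (coefficient times percentage change) with the exact change implied by the logarithmic form, using the full trial coefficient of 0.30 reported above.

```python
def linear_pct_change(elasticity, pct_increase_in_input):
    """Common approximation: percentage change in output = elasticity * percentage change in input."""
    return elasticity * pct_increase_in_input

def exact_pct_change(elasticity, pct_increase_in_input):
    """Exact percentage change in output implied by a log-log coefficient."""
    return ((1.0 + pct_increase_in_input / 100.0) ** elasticity - 1.0) * 100.0

# Coefficient of judges for full trial cases from the GMM model (0.30),
# combined with a 10% increase in the number of serving judges.
approx = linear_pct_change(0.30, 10.0)
exact = exact_pct_change(0.30, 10.0)
print(f"linear reading: {approx:.2f}%, exact log-log reading: {exact:.2f}%")
```

For changes of this size the two readings differ only marginally, so the approximate interpretation used in the text is adequate.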

Although our results underline that commercial court output in Poland is driven primarily by the demand for court services, they also indicate that serving judges contribute significantly to the number of resolved cases. These findings are confirmed by the estimates from two-way fixed effects models that address potential endogeneity as well as by the estimates from the instrumental variables GMM approach. Therefore, our findings shed more light on the role played by judges in the adjudication of cases filed to courts, as discussed in similar studies on judiciary performance in other countries. Contrary to a large body of previous research investigating the judiciaries of Israel, Slovenia and Bulgaria, we find evidence that commercial courts in Poland adjudicating in full trial cases are driven not only by the demand for their services but also by their judicial staff.

The estimates from Eqs. 1 and 2 can be directly compared to the results of other research that applied analogous methodology (i.e. pooled and two-way FE models) to verify the significance of judges to resolving cases. Table 5 lists the estimates of judges in our study and compares them with the results of analogous research carried out in other countries. Our results for civil courts from two-way FE models indicate that the number of judges does have a significant and positive impact on the number of resolved full trial cases. Hence, in this respect they are consistent with the findings of Dimitrova-Grajzl et al. (2012), who analysed cases resolved in local courts in Slovenia (col. 2). On the other hand, our results for all cases pooled support findings on the insignificant role of judges in resolving cases in district courts in Slovenia (Dimitrova-Grajzl et al. 2012), district courts in Bulgaria (Dimitrova-Grajzl et al. 2016) and high and district courts in Israel (Beenstock and Haitovsky 2004).

Table 5 Impact of judges on the number of resolved cases—a comparison of the estimates for Poland with the results for other countries

5.4 Court efficiency: stochastic frontier analysis

The estimates of Eq. 4 for all commercial cases combined are presented in Table 6. They indicate that the primary driver of the total number of resolved cases of any type is the number of new cases filed to court. This result confirms that commercial courts in Poland are mainly driven by the demand for the judicial services they provide. The coefficient of judges turned out to be highly statistically significant in only one specification (col. 5); in the other specifications it was either insignificantly different from zero or only weakly significant (col. 2, 6). The auxiliary court staff to judges ratios do not appear to significantly affect court inefficiency in resolving filed cases. In contrast, court performance was adversely affected by variables capturing the economic development of the court jurisdiction, i.e. average annual income per capita and the share of privately owned enterprises in the total number of registered enterprises. The higher inefficiency of commercial courts serving more developed regions may result from the greater complexity and difficulty of filed cases compared with less developed court jurisdictions. In the following steps we investigated whether and to what extent these results differ among the three distinguished types of commercial cases.

Table 6 Stochastic frontier analysis: all cases

Contrary to the results for all types of commercial cases discussed previously, the estimates of Eq. 4 for cases requiring a full court trial indicate that judges determine to a significant degree the maximum feasible number of resolved cases of this type (Table 7). These results are consistent with our findings from the panel data approach. The SFA analysis indicates that court efficiency in the adjudication of FT cases depends significantly and positively on the number of judge assistants: the higher the ratio of judge assistants to judges, the lower the court inefficiency in resolving cases requiring a full court trial. Analogous to the findings obtained for all commercial cases combined, the results for FT cases also show that the higher the share of privately owned enterprises in the total number of registered firms, the lower the court efficiency. As underlined before, commercial cases that need a full trial and are filed to courts serving more developed regions might be more complex, and thus they negatively affect court efficiency statistics.

Table 7 Stochastic frontier analysis—cases demanding a full court trial

The results of the SFA conducted for writ-of-payment cases indicate that courts dealing with this type of case are entirely driven by the demand for judicial services captured by the number of newly filed cases (Table 8). The coefficient of judges for this type of case turned out to be insignificant in all specifications. Contrary to the results we obtained for all cases and FT cases, the estimates indicate that the higher the court clerks to judges ratio, the higher the court efficiency in resolving writ-of-payment cases. Moreover, the results indicate that court efficiency in dealing with writ-of-payment cases is subdued in more economically developed court jurisdictions.

Table 8 Stochastic frontier analysis—writ-of-payment cases

Finally, we explored court efficiency in resolving non-litigious commercial cases (Table 9). As for writ-of-payment cases, we do not find a significant role of judges in determining the maximum feasible number of resolved cases. This indicates that court performance is significantly affected by the number of judges only for cases that require a full court trial. For the other types of cases commercial court activity is entirely dependent on the demand for judicial services. Contrary to previous findings, the estimates suggest that commercial court efficiency in resolving non-litigious cases is positively associated with economic development, as proxied by the share of privately owned enterprises. Surprisingly, in some specifications we found that the higher the legal clerks to judges ratio, the lower the court efficiency in resolving non-litigious cases, but this unexpected result is not robust.

Table 9 Stochastic frontier analysis: non-litigious cases

6 Conclusions

In this study we analysed the performance of first-instance commercial courts adjudicating in small-value disputes between businesses in Poland. The research focuses on the determinants of court output measured by the number of resolved cases and on the factors influencing court efficiency in resolving cases of different types. We shed more light on a topic discussed widely in recent literature, namely the extent to which court output is determined solely by the demand for justice services and whether the number of serving judges can affect court performance. The unique dataset we applied for this research enabled us to analyse all cases combined as well as three distinguished types of commercial cases that differ as regards the required involvement of judges in their resolution. Whereas we found that the judicial system in Poland is on average driven by the demand for justice, our results indicate that an increase in the number of judges can significantly increase the number of resolved cases that require a full court trial. These findings are robust to potential endogeneity, which we addressed by applying fixed effects regressions as well as an instrumental variable GMM approach. Alternatively, reducing judges’ involvement in other types of cases, which do not require their direct presence, may boost their productivity in full trials too.

In addition to the traditional analysis of the determinants of court output measured by the number of resolved cases, we also investigated the determinants of court efficiency by applying SFA. The results indicate that court efficiency is significantly associated with some categories of auxiliary court staff and with variables capturing the economic development of the court jurisdiction. Specifically, we found that judge assistants increase court efficiency in resolving commercial cases requiring a full trial and court clerks boost court efficiency in resolving writ-of-payment cases. Moreover, our results provide some tentative evidence that court efficiency is dampened in more economically developed regions. This finding can be explained by the greater complexity and difficulty of commercial cases filed to courts in those regions. However, these findings have to be interpreted with caution because, owing to data limitations, we investigated only some proxies of economic development and not the actual complexity of filed cases.

The determinants of judicial systems definitely merit greater research. One particular issue omitted from this study is the performance of Polish labor courts resolving respectively disputes between businesses and employer-employee conflicts.