Test-per-clock BIST scheme has the advantages of very short test application time and small test data volume. However, conventionally this scheme needs an extra parallel response monitor for response analysis that may lead to large area overhead. This paper presents a new test-per-clock BIST method that can perform both pattern generation and response compression concurrently in the same LFSR-based design so as to reduce the area overhead. Furthermore, some internal nets are employed in two ways during test application to help reduce test time and test data volume: 1) as the observation points to enhance fault detectability and 2) as the test data provider for reseeding the LFSR. These two ways lead to the benefits that all required patterns can be generated on chip and at-speed testing can be carried out without using any external or internal storage device. Experimental results show that the presented method can achieve 100% fault coverage in very short time for large ISCAS (IWLS) benchmark circuits using 0.09% (0.03%) of internal nets with 9.29% (8.26%) extra area overhead respectively. When compared with a conventional scan-based design, the area overhead is small considering the features of test-per-clock and no requirement of data storage or expensive test equipment.