24h購物| | PChome| 登入
2013-03-17 22:50:23| 人氣1,077| 回應0 | 上一篇 | 下一篇

[UVA][二分答案] 11860 - Document Analyzer

推薦 0 收藏 0 轉貼0 訂閱站台

B

Document Analyzer

Input: Standard Input

Output: Standard Output

 

You work in a leading software development company. As you are great in coding, most of the critical tasks are allotted for you. You like the challenge and you feel excited to solve those problems.

 

Recently your company is developing a project named Document Analyzer. In this project you are assigned a task; of course a critical task. The task is that you are given a document consisting of lowercase letters, numbers and punctuations. You have to analyze the document and separate the words first. Words are consecutive sequences of lower case letters. After listing the words, in the order same as they occurred in the document, you have to number them from 1, 2, ... n. After that you have to find the range p and q (p ≤ q) such that all kinds of words occur between p and q (inclusive). If there are multiple such solutions you have to find the one where the difference of p and q is smallest. If still there is a tie, then find the solution where p is smallest.

Since you do have other works to do, you have to solve this task within 5 hours.

 

Input

First line of input will contain T (≤ 20) denoting number of documents.

Each document will be denoted by one or more lines, each line having no more than 150 characters. A document will contain either lowercase letters or numbers or punctuations. The last line of a document will contain the word ‘END’ which is of course not the part of the document. You can assume that a document will contain between 1 and 105 words (inclusive).

 

Output

For each document, print the document number first. After that, print p and q as described above. For exact formatting, see the samples.

 

Sample Input                             Output for Sample Input

3
1. a case is a case, 
2. case is not a case~
END
a b c d e
END
a@#$a^%a a a
b b----b b++12b
END
 
Document 1: 6 9
Document 2: 1 5
Document 3: 5 6

The input file size is about 9 MB, so use faster input techniques.


Problemsetter: Jane Alam Jan, Special Thanks: Abdullah-al-Mahmud


題目要求找一個區間,這個區間出現過整個文本的所有單字。

其中區間長度越小越好,同時左端點越小越好。

二分長度 O(logn),然後 O(n) 檢查。

效率 O(nlogn)


#include <stdio.h>
#include <iostream>
#include <map>
#include <string.h>
using namespace std;
map<string, int> WORD;
int list[100000];
int mark[100000];
void WORDclear(int i, int L) {
    while(i >= 0 && L >= 0)
        mark[list[i]] = 0, i--, L--;
}
int check(int L, int n, int tot) {
    int i, cnt = tot;
    map<string, int>::iterator it;
    for(i = 0; i < L; i++) {
        if(mark[list[i]] == 0)
            cnt--;
        mark[list[i]]++;
    }
    if(cnt == 0) {
        WORDclear(L-1, L);
        return 0;
    }
    for(i = L; i < n; i++) {
        if(mark[list[i]] == 0)
            cnt--;
        mark[list[i]]++;
        if(mark[list[i-L]] == 1)
            cnt++;
        mark[list[i-L]]--;
        if(cnt == 0) {
            WORDclear(i, L);
            return i-L+1;
        }
    }
    WORDclear(n-1, L);
    return -1;
}
int main() {
    int t, i, j, k, cases = 0;
    char line[1024], buf[1024];
    scanf("%d", &t);
    gets(line);
    while(t--) {
        WORD.clear();
        int n = 0, label = 1, x;
        map<string, int>::iterator it;
        while(gets(line) && strcmp("END", line)) {
            for(i = 0; line[i]; i++) {
                if(line[i] >= 'a' && line[i] <= 'z') {
                    int idx = 0;
                    while(line[i] && line[i] >= 'a' && line[i] <= 'z')
                        buf[idx++] = line[i++];
                    buf[idx] = '\0', i--;
                    it = WORD.find(buf);
                    if(it == WORD.end()) {
                        x = label;
                        WORD[buf] = label++;
                    } else
                        x = it->second;
                    list[n++] = x;
                }
            }
        }
        int L = 0, R = n, m;
        int p, q, tp;
        while(L <= R) { // length of [p, q]
            m = (L+R)/2;
            tp = check(m, n, label-1);
            if(tp >= 0)
                R = m-1, p = tp, q = p+m-1;
            else
                L = m+1;
        }
        printf("Document %d: %d %d\n", ++cases, p+1, q+1);
    }
    return 0;
}

台長: Morris
人氣(1,077) | 回應(0)| 推薦 (0)| 收藏 (0)| 轉寄
全站分類: 不分類 | 個人分類: UVA |
此分類下一篇:[UVA][Queue] 10901 - Ferry Loading III
此分類上一篇:[UVA][射線法、線段交] 10348 - Submarines

是 (若未登入"個人新聞台帳號"則看不到回覆唷!)
* 請輸入識別碼:
請輸入圖片中算式的結果(可能為0) 
(有*為必填)
TOP
詳全文